Working with XML in Python (8) | Never Too Late

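"Now that we have taken a quick look at how XSLT behaves, from next time I will study how to integrate it into Python."

That is what I wrote last time, but I have decided to look at how XSLT works a little more first.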

—–

This time I will convert the industry.xml below into HTML using XSLT. Watching how the HTML changes as we modify the stylesheet, trans-industry.xsl, should make it clear how XSLT works.

industry.xml

<?xml version="1.0"?>
<industry area="Automobile">
<company name="Toyota">
<headquarters>Aichi</headquarters>
<employee>100,000</employee>
<famous>COROLLA</famous>
</company>
<company name="Nissan">
<headquarters>Kanagawa</headquarters>
<employee>70,000</employee>
<famous>SKYLINE</famous>
</company>
<company name="Mazda">
<headquarters>Hiroshima</headquarters>
<employee>50,000</employee>
<famous>RX-7</famous>
</company>
</industry>

trans-industry.xsl

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/">
<html>
<head>
<title>The Japanese Industry of <xsl:apply-templates /></title>
</head>
</html>
</xsl:template>
<xsl:template match="industry">
<xsl:value-of select="@area"/>
</xsl:template>
</xsl:stylesheet>

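This is the simplest example. Transforming industry.xml with trans-industry.xsl produces the HTML file below, which contains only a title. You can see that the value of the area attribute of the industry element was used in the title.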

<html>
<head>
<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type">
<title>The Japanese Industry of Automobile</title>
</head>
</html>
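
To make the data flow concrete, here is a rough Python sketch of what the two templates compute. It is not an XSLT processor: it only uses the standard library's xml.etree.ElementTree, the variable names are made up for illustration, and a trimmed copy of industry.xml is inlined rather than read from disk.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of industry.xml, inlined so the sketch is self-contained.
INDUSTRY_XML = """<?xml version="1.0"?>
<industry area="Automobile">
<company name="Toyota"/>
</industry>"""

root = ET.fromstring(INDUSTRY_XML)

# Counterpart of <xsl:value-of select="@area"/> in the "industry" template.
area = root.get("area")

# Counterpart of the literal text inside <title> in the "/" template.
title = "The Japanese Industry of " + area
print(title)
```

The point is only that the title is literal text from the stylesheet joined with a value pulled out of the source document.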

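Next, add a body section to trans-industry.xsl so that the HTML has content as well as a title. Because the industry element now has to be rendered in two different ways, the templates are distinguished with the mode attribute.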

trans-industry.xsl

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/">
<html>
<head>
<title>The Japanese Industry of <xsl:apply-templates mode="title" /></title>
</head>
<body>
<center>
<xsl:apply-templates mode="table" />
</center>
</body>
</html>
</xsl:template>
<xsl:template match="industry" mode="title">
<xsl:value-of select="@area"/>
</xsl:template>
<xsl:template match="industry" mode="table">
<table border="1">
<tr><th>Company</th>
<th>Headquarters</th>
<th>Employee</th>
<th>Famous Car</th>
</tr>
</table>
</xsl:template>
</xsl:stylesheet>

The result is as follows. If you open the HTML file in a browser, only the header row of the table should be displayed.

<html>
<head>
<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type">
<title>The Japanese Industry of Automobile</title>
</head>
<body>
<center>
<table border="1">
<tr>
<th>Company</th>
<th>Headquarters</th>
<th>Employee</th>
<th>Famous Car</th>
</tr>
</table>
</center>
</body>
</html>
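
The mode attribute used above lets the same industry element be handled by two different templates. As a loose analogy only (this is not how any XSLT processor is implemented), the idea can be sketched in Python as a lookup keyed by element name and mode. TEMPLATES, apply_templates, and the two handler functions are hypothetical names introduced for this illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<industry area="Automobile"/>')

def industry_title(node):
    # Plays the role of the template with mode="title".
    return node.get("area")

def industry_table(node):
    # Plays the role of the template with mode="table" (header row only).
    headers = ["Company", "Headquarters", "Employee", "Famous Car"]
    cells = "".join("<th>%s</th>" % h for h in headers)
    return '<table border="1"><tr>%s</tr></table>' % cells

# Hypothetical lookup table: (element name, mode) -> template function.
TEMPLATES = {
    ("industry", "title"): industry_title,
    ("industry", "table"): industry_table,
}

def apply_templates(node, mode):
    # Pick the handler registered for this element name in the given mode.
    return TEMPLATES[(node.tag, mode)](node)

print(apply_templates(root, "title"))
print(apply_templates(root, "table"))
```

The same node yields different output depending on the mode it is processed in, which is what lets the stylesheet use industry once for the title and once for the table.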

Finally, extend trans-industry.xsl once more so that the table, which so far had only a header row, shows the data from industry.xml.

trans-industry.xsl

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/">
<html>
<head>
<title>The Japanese Industry of <xsl:apply-templates mode="title" /></title>
</head>
<body>
<center>
<xsl:apply-templates mode="table" />
</center>
</body>
</html>
</xsl:template>
<xsl:template match="industry" mode="title">
<xsl:value-of select="@area"/>
</xsl:template>
<xsl:template match="industry" mode="table">
<table border="1">
<tr><th>Company</th>
<th>Headquarters</th>
<th>Employee</th>
<th>Famous Car</th>
</tr>
<xsl:apply-templates />
</table>
</xsl:template>
<xsl:template match="industry/company">
<tr><td><xsl:value-of select="@name" /></td>
<td><xsl:value-of select="headquarters" /></td>
<td><xsl:value-of select="employee" /></td>
<td><xsl:value-of select="famous" /></td>
</tr>
</xsl:template>
</xsl:stylesheet>

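The result is as follows. The data in industry.xml is now used as the content of the table. In other words, the data in industry.xml has been transformed into an easier-to-read form (an HTML table). Please do check it in a browser.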

<html>
<head>
<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type">
<title>The Japanese Industry of Automobile</title>
</head>
<body>
<center>
<table border="1">
<tr>
<th>Company</th>
<th>Headquarters</th>
<th>Employee</th>
<th>Famous Car</th>
</tr>
<tr>
<td>Toyota</td>
<td>Aichi</td>
<td>100,000</td>
<td>COROLLA</td>
</tr>
<tr>
<td>Nissan</td>
<td>Kanagawa</td>
<td>70,000</td>
<td>SKYLINE</td>
</tr>
<tr>
<td>Mazda</td>
<td>Hiroshima</td>
<td>50,000</td>
<td>RX-7</td>
</tr>
</table>
</center>
</body>
</html>
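
For comparison, the work done by the industry/company template can be approximated in plain Python. This is a minimal sketch assuming the same industry.xml content, inlined here; it uses only xml.etree.ElementTree, and company_row is a hypothetical helper name, not part of any XSLT library.

```python
import xml.etree.ElementTree as ET

# Same data as industry.xml, inlined so the sketch is self-contained.
INDUSTRY_XML = """<industry area="Automobile">
<company name="Toyota">
<headquarters>Aichi</headquarters>
<employee>100,000</employee>
<famous>COROLLA</famous>
</company>
<company name="Nissan">
<headquarters>Kanagawa</headquarters>
<employee>70,000</employee>
<famous>SKYLINE</famous>
</company>
<company name="Mazda">
<headquarters>Hiroshima</headquarters>
<employee>50,000</employee>
<famous>RX-7</famous>
</company>
</industry>"""

def company_row(company):
    # Mirrors the four <xsl:value-of> selects: @name, headquarters,
    # employee, famous. No HTML escaping is done, since the sample
    # values contain no special characters.
    cells = [
        company.get("name"),
        company.findtext("headquarters"),
        company.findtext("employee"),
        company.findtext("famous"),
    ]
    return "<tr>" + "".join("<td>%s</td>" % c for c in cells) + "</tr>"

root = ET.fromstring(INDUSTRY_XML)

# One row per company element, in document order.
rows = [company_row(c) for c in root.findall("company")]
for row in rows:
    print(row)
```

With XSLT, the loop and the row layout live in the stylesheet rather than in program code, so the output format can be changed without touching the program.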

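As shown here, XSLT provides convenient features for transformation. Although I have not covered them, it apparently also offers more advanced features such as conditionals and iteration (processing items one after another). The W3C specification should explain those in detail (though I have hardly read it myself yet).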

Next time, for real this time, I will look at how to embed an XSLT processor in Python.