lojjic.net

Home of Jason Johnston

Monday, November 24, 2003

URI Parsing in XSLT

In creating the XSLT transformations for this site, I often ran into situations where I needed to extract either the path or the filename from a URI. In most programming and scripting languages this is easy to do with standard string manipulation or regular expression functions, but as anyone who has used XSLT extensively knows, XSLT 1.0 has no regular expressions and only a handful of basic string functions. (XSLT 2.0 should make this much easier, but it will be a while before it is standardized and supported by XSLT tools.)

The solution I found is to create a pair of named templates that call themselves recursively, consuming the URI one slash-delimited segment at a time. Here are my templates:

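<!-- Outputs everything up to and including the last '/' in $uri. -->
<xsl:template name="getPath">
  <xsl:param name="uri" />
  <xsl:if test="contains($uri,'/')">
    <!-- Emit the leading segment plus its slash, then recurse on the rest. -->
    <xsl:value-of select="concat(substring-before($uri,'/'),'/')" />
    <xsl:variable name="afterSlash" select="substring-after($uri,'/')" />
    <xsl:call-template name="getPath">
      <xsl:with-param name="uri" select="$afterSlash" />
    </xsl:call-template>
  </xsl:if>
  <!-- No slash left: the remainder is the filename, so output nothing. -->
</xsl:template>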

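<!-- Outputs whatever follows the last '/' in $uri (all of $uri if it has no '/'). -->
<xsl:template name="getFilename">
  <xsl:param name="uri" />
  <xsl:choose>
    <xsl:when test="contains($uri,'/')">
      <!-- Discard the leading segment and recurse on the rest. -->
      <xsl:variable name="afterSlash" select="substring-after($uri,'/')" />
      <xsl:call-template name="getFilename">
        <xsl:with-param name="uri" select="$afterSlash" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <!-- No slash left: what remains is the filename. -->
      <xsl:value-of select="$uri" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>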

They might be used like this:

<xsl:variable name="theURI">http://lojjic.net/blog.rdf.html</xsl:variable>

<xsl:text>The URI: </xsl:text>
<xsl:value-of select="$theURI" />

<xsl:text> --- The Path: </xsl:text>
<xsl:call-template name="getPath">
  <xsl:with-param name="uri" select="$theURI" />
</xsl:call-template>

<xsl:text> --- The Filename: </xsl:text>
<xsl:call-template name="getFilename">
  <xsl:with-param name="uri" select="$theURI" />
</xsl:call-template>

Which should give:

The URI: http://lojjic.net/blog.rdf.html --- The Path: http://lojjic.net/ --- The Filename: blog.rdf.html
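
For comparison, here is a rough sketch of how the same two extractions could look once an XSLT 2.0 processor is available. It is not part of the templates above, and it assumes a processor that implements the XPath 2.0 functions replace() and tokenize(). The wrapping stylesheet and the match="/" template are just scaffolding so the example can be run on its own against any input document.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <xsl:template match="/">
    <xsl:variable name="theURI" select="'http://lojjic.net/blog.rdf.html'" />

    <!-- Path: strip the trailing run of non-slash characters.
         The pattern uses + rather than * because replace() raises an
         error for patterns that can match a zero-length string. -->
    <xsl:value-of select="replace($theURI, '[^/]+$', '')" />

    <xsl:text> --- </xsl:text>

    <!-- Filename: the last of the slash-separated pieces. -->
    <xsl:value-of select="tokenize($theURI, '/')[last()]" />
  </xsl:template>
</xsl:stylesheet>
```

As with the recursive templates, a URI that ends in a slash yields an empty filename, and a string with no slash at all yields an empty path.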

Posted by Jason at 10:10:57