Getting the title of an HTML document with Bash

published: 2018.10.01 / modified: 2018.10.02

This post uses a few Linux commands to get the title (the content of the title element) of a web page or a local HTML file.

Extracting with cat, grep, and sed

The following example extracts the content of the title element from a local HTML file.

cat index.html | grep -o '<title>.*</title>' | sed 's#<title>\(.*\)</title>#\1#'

This uses the cat, grep, and sed commands.
Because grep matches one line at a time, the extraction fails when the title element contains a line break.
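One possible workaround for the line-break limitation is to join the input into a single line before grep sees it. The following is a minimal sketch, not a robust HTML parser: the function name extract_title is hypothetical, and it assumes a single lowercase title element without attributes.

```shell
# extract_title (hypothetical helper): read HTML on stdin, print the title.
extract_title() {
  # Replace newlines with spaces so the whole title element sits on one line,
  # pull out the element, then drop the tags and trim surrounding spaces.
  tr '\n' ' ' \
    | grep -o '<title>.*</title>' \
    | sed -e 's#<title>\(.*\)</title>#\1#' -e 's/^ *//' -e 's/ *$//'
}

# Sample input whose title element is split across several lines.
printf '<html><head><title>\nHello\nWorld\n</title></head></html>\n' | extract_title
```

For a local file, this could be called as extract_title < index.html. Since .* is greedy, a document with more than one title element (for example inside inline SVG) would be over-matched.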

The following example gets the title of a web page.

curl -s http://example.com | grep -o '<title>.*</title>' | sed 's#<title>\(.*\)</title>#\1#'

This uses the curl, grep, and sed commands.
As with the local file, the title cannot be retrieved when the title element contains a line break.
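Real web pages often write the tag in uppercase, give it attributes, or use CRLF line endings. The sketch below is one way to tolerate those cases. The function name html_title is hypothetical, and the pattern assumes the title text itself contains no < character.

```shell
# html_title (hypothetical helper): read HTML on stdin, print the first title.
html_title() {
  # Turn CR and LF into spaces, match the title element case-insensitively
  # (attributes allowed), keep the first match, then strip tags and trim.
  tr '\r\n' '  ' \
    | grep -io '<title[^>]*>[^<]*</title>' \
    | head -n 1 \
    | sed -e 's#<[^>]*>##g' -e 's/^ *//' -e 's/ *$//'
}

# Sample input with an uppercase tag, an attribute, and CRLF line endings.
printf '<HTML><HEAD><TITLE lang="en">\r\n  Example Domain\r\n</TITLE></HEAD></HTML>\n' | html_title
```

Against a live page it could be used as curl -sL http://example.com | html_title, where -L makes curl follow redirects. Character references such as &amp;amp; are left undecoded.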