Creating a Custom Excerpt with Two Paragraphs
July 17, 2013 Development, HTML, PHP
The markup of my news articles is structured like this:
<p><a href="http://variety.com/2013/film/games/michael-bay-to-develop-ubisofts-ghost-recon-film-at-warner-bros-exclusive-1200491557/" target="_blank">Variety</a> is reporting that Ubisoft has hired <a href="http://www.imdb.com/name/nm0000881/" target="_blank">Michael Bay</a> to develop Ghost Recon which is now setup at Warner Bros.</p>
<p>Bay could direct the film but he's waiting on a script to be written first. Ubisoft plans to hire a screenwriter later this month and if all goes according to plan, will start hiring actors in July.</p>
Since I kept the markup pretty simple, I only needed to print everything up to the second occurrence of the closing paragraph tag. For this, I used the strpos() function, which finds the position of the first occurrence of a substring within a string.
For any search, we need to define our haystack (the string that we’ll be searching in) and our needle (the string that we’ll be searching for). I passed in the $content variable, which holds the full contents of my news article, as the haystack. Then I passed in the closing paragraph tag as the needle to find the end of the first paragraph.
$needle = '</p>';
$first_p = strpos($content, $needle);
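To make the return value concrete, here is a minimal sketch that uses a made-up $sample string instead of a real article. Positions are zero-based, and strpos() returns false when the needle is missing, which is why a strict comparison is the safer check.

```php
<?php
// Hypothetical sample content, used only for illustration.
$sample = '<p>One.</p><p>Two.</p>';
$needle = '</p>';

// Zero-based position of the first closing paragraph tag.
$first_p = strpos($sample, $needle);
var_dump($first_p); // int(7)

// strpos() returns false (not -1) when the needle is absent,
// so compare with === or !== rather than a loose check.
$missing = strpos('no paragraphs here', $needle);
var_dump($missing === false); // bool(true)
```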
This gave me the position of the first closing paragraph tag, but what about the second paragraph? I used the strpos() function again, but this time I passed in an offset as the third argument.
An offset tells strpos() where to begin the search instead of starting from the beginning of the haystack. In effect, we are hiding what we have already searched through and looking for the next needle.
The position of the needle’s first occurrence combined with the length of our needle acts as the offset below. Since the strpos() function gave us the starting position of our needle, we need to add the length of our needle to it in order to fully hide what was already searched. The length of the closing paragraph tag is 4, one for each of the characters <, /, p and >.
$second_p = strpos($content, $needle, $first_p + strlen($needle));
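The effect of the offset is easier to see on a small example. The sketch below again uses a made-up $sample string. It also shows what happens if the needle length is left out of the offset: the search starts on the first closing tag and simply finds it again.

```php
<?php
// Hypothetical sample content, used only for illustration.
$sample = '<p>One.</p><p>Two.</p><p>Three.</p>';
$needle = '</p>';

$first_p = strpos($sample, $needle); // 7

// Without the needle length, the search begins on the first
// closing tag and matches it again.
$same_tag = strpos($sample, $needle, $first_p); // 7

// With the needle length added (7 + 4 = 11), the search begins
// just past the first closing tag and finds the second one.
$second_p = strpos($sample, $needle, $first_p + strlen($needle)); // 18
```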
Now we have the position where the second paragraph closes. With this information, we can use the substr() function, which returns part of a string, also called a substring.
To grab the text of our first two paragraphs, we again need to add the length of our needle so that the entire closing paragraph tag is included.
Starting at position 0, we will grab all of the characters up to the end of our needle.
$excerpt = substr($content, 0, $second_p + strlen($needle));
Stored in the $excerpt variable, we now have our text.
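The steps so far assume the article always has at least two paragraphs. If a closing tag is missing, strpos() returns false, and as far as I can tell the arithmetic then treats false as 0, so the excerpt would be cut off after only a few characters. Here is a sketch of a guarded version. The function name two_paragraph_excerpt() is my own invention for this example, and falling back to a shorter excerpt is just one possible choice.

```php
<?php
// Hypothetical helper, not part of the original code.
function two_paragraph_excerpt(string $content): string
{
    $needle = '</p>';

    $first_p = strpos($content, $needle);
    if ($first_p === false) {
        // No closing paragraph tag at all: return the content as-is.
        return $content;
    }

    $second_p = strpos($content, $needle, $first_p + strlen($needle));
    if ($second_p === false) {
        // Only one paragraph: cut right after it.
        return substr($content, 0, $first_p + strlen($needle));
    }

    // Two or more paragraphs: keep everything through the second closing tag.
    return substr($content, 0, $second_p + strlen($needle));
}

echo two_paragraph_excerpt('<p>One.</p><p>Two.</p><p>Three.</p>');
// <p>One.</p><p>Two.</p>
```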
However, some news articles had an opening blockquote tag before the start of the second paragraph, which left the excerpt ending inside an unclosed blockquote.
<p>The first paragraph.</p>
<blockquote>
<p>The second paragraph.</p>
<!-- Our excerpt would stop here after the end of </p> -->
</blockquote>
I again used the strpos() function to check whether our newly created excerpt contained an opening <blockquote> tag. If the <blockquote> is found, I needed to add a closing </blockquote> tag to my excerpt so nothing breaks. The \n in double quotes is a newline character and is used to push the closing </blockquote> tag onto the next line in the HTML.
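// Check the excerpt, not the full article: a blockquote further down
// in $content should not add a closing tag here.
if(strpos($excerpt, '<blockquote>') !== FALSE)
{
    $excerpt .= "\n" . '</blockquote>';
}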
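A slightly more defensive variation, as a sketch only: compare how many blockquotes were opened and closed inside the excerpt, so that a blockquote that already ended within the first two paragraphs does not receive a second closing tag. This assumes the tags are written exactly as <blockquote> and </blockquote>, with no attributes, and the sample $excerpt is made up.

```php
<?php
// Hypothetical excerpt that was cut off inside a blockquote.
$excerpt = "<p>The first paragraph.</p>\n<blockquote>\n<p>The second paragraph.</p>";

$opened = substr_count($excerpt, '<blockquote>');
$closed = substr_count($excerpt, '</blockquote>');

// Append one closing tag for each blockquote that is still open.
if ($opened > $closed) {
    $excerpt .= str_repeat("\n" . '</blockquote>', $opened - $closed);
}
```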
The excerpt was now complete. The same technique can be used to shorten lists and show only the first two or three bullet points. The full code can be found below.
$needle = '</p>';
$first_p = strpos($content, $needle);
$second_p = strpos($content, $needle, $first_p + strlen($needle));
$excerpt = substr($content, 0, $second_p + strlen($needle));
if(strpos($excerpt, '<blockquote>') !== FALSE)
{
    $excerpt .= "\n" . '</blockquote>';
}
// The code can be shortened by removing some variables
$excerpt = substr($content, 0, strpos($content, '</p>', strpos($content, '</p>') + 4) + 4);
if(strpos($excerpt, '<blockquote>') !== FALSE)
{
    $excerpt .= "\n" . '</blockquote>';
}
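For the list-item case mentioned earlier, the same idea can be wrapped in a loop that moves the offset forward once per occurrence. The helper below is a sketch under my own assumptions: excerpt_until_nth() is a made-up name, and returning the content unchanged when there are too few occurrences is one possible fallback.

```php
<?php
// Hypothetical helper: keep everything up to and including the
// $count-th occurrence of $needle. Returns $content unchanged
// if there are fewer than $count occurrences.
function excerpt_until_nth(string $content, string $needle, int $count): string
{
    $offset = 0;
    for ($i = 0; $i < $count; $i++) {
        $pos = strpos($content, $needle, $offset);
        if ($pos === false) {
            return $content;
        }
        // Hide everything searched so far, including the needle itself.
        $offset = $pos + strlen($needle);
    }

    return substr($content, 0, $offset);
}

$list = '<ul><li>A</li><li>B</li><li>C</li><li>D</li></ul>';

// Keep the first three bullet points, then close the list again.
echo excerpt_until_nth($list, '</li>', 3) . '</ul>';
// <ul><li>A</li><li>B</li><li>C</li></ul>
```

As with the blockquote, the cut leaves the surrounding <ul> open, so the closing tag is added back by hand.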