
Extract Blog Post Links from RSS feeds

As part of my goal of automation here, I wrote a small Python script to extract blog post links from RSS feeds. I did this to pull the title and link of blog posts from a particular date range in my RSS feed. In theory, it should be pretty easy, but I came to find that time was not my friend.

What tripped me up was how some functions in Python handle time objects. Read on to learn more!

What it does

What this script does is first parse my RSS feed, then use a 7 day date range to extract the posted blog titles and links, and then write them to a markdown file. Super simple, and you'll need the feedparser library installed.

The real trick here is not the loop, but timetuple(). This is where I first got tripped up.

I first created a variable for today's date and another variable for 7 days before, like so:

import datetime as DT
import feedparser

today = DT.date.today()
week_ago = today - DT.timedelta(days=7)

The output of today becomes this: datetime.date(2018, 9, 8)
The output of week_ago becomes this: datetime.date(2018, 9, 1)

So far so good! The idea was to use a logic function like if post.date >= week_ago AND post.date <= today, then extract stuff.

So I parsed my feed and then using the built in time parsing features of feedparser, I wrote my logic function.

BOOM, it didn't work. After sleuthing the problem I found that the dates extracted by feedparser were time.struct_time objects, whereas my variables today and week_ago were datetime.date objects.
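
To see the mismatch concretely, here is a minimal sketch. The struct_time is built by hand to stand in for what feedparser returns in published_parsed, and the dates are made up for illustration.

```python
import datetime as DT
import time

# Stand-in for a feedparser date (published_parsed is a time.struct_time).
# The values are invented: 5 September 2018, 14:30.
post_date = time.struct_time((2018, 9, 5, 14, 30, 0, 2, 248, 0))
week_ago = DT.date(2018, 9, 1)

try:
    in_range = post_date >= week_ago
    comparison_failed = False
except TypeError as err:
    # Python 3 refuses to order a struct_time against a datetime.date
    comparison_failed = True
    print("Comparison failed:", err)
```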

Enter timetuple() to the rescue. Calling timetuple() turns the date object into a time.struct_time object, just by doing this:

t = today.timetuple()
w = week_ago.timetuple()
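
A small sketch of what the conversion buys you, using a fixed date instead of today so the result is predictable. The post date is again an invented stand-in for feedparser's published_parsed.

```python
import datetime as DT
import time

today = DT.date(2018, 9, 8)
week_ago = today - DT.timedelta(days=7)

t = today.timetuple()
w = week_ago.timetuple()

# Both are now time.struct_time, the same type feedparser hands back
print(type(t).__name__)

# Invented post date: 5 September 2018, 14:30
post_date = time.struct_time((2018, 9, 5, 14, 30, 0, 2, 248, 0))

# struct_time values compare field by field, like tuples
in_range = w <= post_date <= t
print(in_range)
```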

After that, it was straightforward to do the loop and write out the results, see below.

Python Script

import datetime as DT
import feedparser

today = DT.date.today()
week_ago = today - DT.timedelta(days=7)

# Structure the times so feedparser and datetime can talk
t = today.timetuple()
w = week_ago.timetuple()

# Parse THE FEED!
d = feedparser.parse('http://www.neuralmarkettrends.com/feeds/all.atom.xml')

# Create list to write extracted posts into
output_posts = []
for entry in d.entries:
    date = entry.published_parsed
    # I need to automate this part below
    if date >= w and date <= t:
        # Keep the title and link of every post inside the date range
        output_posts.append((entry.title, entry.link))

# Write to file, one "title : link" line per post
date_f = str(DT.date.today())
with open(date_f + '-posts.md', 'w') as f:
    for post in output_posts:
        line = ' : '.join(str(x) for x in post)
        f.write(line + '\n')
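
One thing to watch, as far as I can tell: timetuple() on a date sets the time to midnight, so a post published later today compares as greater than t and drops out of the range. A possible alternative, sketched here with a hypothetical helper named posts_in_range and fake entries in place of a live feed, is to turn each struct_time back into a date and compare dates instead.

```python
import datetime as DT
import time
from types import SimpleNamespace

def posts_in_range(entries, start, end):
    """Return (title, link) pairs for entries published from start to end, inclusive."""
    found = []
    for entry in entries:
        parsed = getattr(entry, 'published_parsed', None)
        if parsed is None:
            continue  # skip entries without a usable date
        # The first three fields of a struct_time are year, month, day
        published = DT.date(*parsed[:3])
        if start <= published <= end:
            found.append((entry.title, entry.link))
    return found

# Fake entries standing in for feedparser's d.entries (values invented)
entries = [
    SimpleNamespace(title='Old post', link='http://example.com/old',
                    published_parsed=time.struct_time((2018, 8, 20, 9, 0, 0, 0, 232, 0))),
    SimpleNamespace(title='Posted today', link='http://example.com/today',
                    published_parsed=time.struct_time((2018, 9, 8, 15, 45, 0, 5, 251, 0))),
]

today = DT.date(2018, 9, 8)
week_ago = today - DT.timedelta(days=7)
recent = posts_in_range(entries, week_ago, today)
print(recent)
```

Comparing plain dates keeps the afternoon post from the same day, which the midnight struct_time comparison would have skipped.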


Python Script to Parse Blog Feeds

I recently wrote a small Python script to parse blog feeds and then tweet them out via Twitter. It randomly picks one of the first 5 RSS entries of a feed and then tweets it out. You'll need to get an API key from Twitter and the credentials, but it's a neat way to keep your readers updated on the various feeds you read or write. Just replace the 'XXXXXXXXXXXXXXXXXXXXXX' with the various keys you get from Twitter.

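    import random

    import feedparser
    from twython import Twython, TwythonError

    # Replace the placeholders with the keys you get from Twitter.
    # Assumed setup: the variable names are illustrative, passed to
    # Twython's four-credential constructor.
    APP_KEY = 'XXXXXXXXXXXXXXXXXXXXXX'
    APP_SECRET = 'XXXXXXXXXXXXXXXXXXXXXX'
    OAUTH_TOKEN = 'XXXXXXXXXXXXXXXXXXXXXX'
    OAUTH_TOKEN_SECRET = 'XXXXXXXXXXXXXXXXXXXXXX'

    twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)

    # Pick one of the first 5 entries (indexes 0 through 4)
    num = random.randint(0, 4)

    d = feedparser.parse('http://www.neuralmarkettrends.com/feed')

    status_text = 'Fresh: ' + d['entries'][num]['title'] + ' link: ' + d['entries'][num]['link'] + ' #NMT'

    # Send the tweet and print the error if Twitter rejects it
    try:
        twitter.update_status(status=status_text)
    except TwythonError as e:
        print(e)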
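
Picking the entry can also be done with random.choice on a slice, which stays in bounds even when a feed has fewer than five entries. This is only a sketch: build_status is a hypothetical helper, and the entries are fake dictionaries standing in for what feedparser returns.

```python
import random

def build_status(entries, limit=5, tag='#NMT'):
    """Pick one of the first `limit` entries at random and format the tweet text."""
    candidates = entries[:limit]
    if not candidates:
        return None  # nothing to tweet
    entry = random.choice(candidates)
    return 'Fresh: ' + entry['title'] + ' link: ' + entry['link'] + ' ' + tag

# Fake entries (invented values); a real run would pass d['entries']
entries = [
    {'title': 'Post %d' % i, 'link': 'http://example.com/%d' % i}
    for i in range(8)
]
status_text = build_status(entries)
print(status_text)
```

Returning None for an empty feed lets the calling code skip the tweet instead of failing on a missing index.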


RSS vs Sharing

Trends come and go in the blink of an eye these days. Usually some market disrupter comes along and changes the game with a shiny new thing. Sometimes it's a service or product that gets shut down. That's exactly what happened here.

This is the story of Google Reader and the rise of Social Sharing.

Before Social Sharing was the "thing", RSS was the lifeblood of blogging. Many bloggers (myself included) reached thousands of RSS feed readers daily. I had over 1,000 RSS readers during my heyday, but over the past few years I watched that number dwindle to maybe 12 people now. What happened? Some of that is because my posting frequency went from a regular daily/weekly cadence to not posting anything for months, and some of it was that I didn't catch the Social Sharing wave.

Blogging was in its heyday sometime between 2011 and 2012, and new platforms like Tumblr and Posterous appeared. Facebook and Twitter were establishing new territory in the social space and gobbling up users like there was no tomorrow. I experimented with them all, but none of them had a real focus at that time. The fight for "mindshare" was heated in this space, and blogging began to ebb in favor of 140 character tweets and reposting on Tumblr. Existing blogging platforms like WordPress and Drupal did grow, but morphed themselves into strong content management systems.

The tipping point came in 2013 when Google killed off Google Reader. RSS was still around as a technology, but blogging began to fragment. Some bloggers established Twitter accounts, others went to Tumblr, and some just went to Facebook. Many blogs shut down or were abandoned. It was just horrible.

Some blogs did survive and some actually embraced these new distribution channels. Suddenly, "Share This" buttons began to appear at the end or at the top of every blog post. You could tweet out a link to an article that you liked and it could be retweeted amongst your followers. You could share another post with your Facebook friends or group members. You could even email the post to co-workers if you chose to. You had complete control over how you wanted to share content. You could consume content and didn't need an RSS feed reader to do it.

The void that Google Reader left behind quickly filled up with Social Sharing Buttons.

The best part? New conversations emerged! New ideas were shared and blog posts changed from static entries into living and breathing content. No longer did you have to go to the comment section on a post to interact with the blogger. Now you could tweet a link, post to FB, and get a social conversation going. Blogging had emerged into a brave new world. It grew up!

Now go share this post. :)


Neural Market Trends is the online home of Thomas Ott.