Tag Python

Posts: 34

Downloading SEC.GOV data

I've finally found a way to download SEC.GOV data in a consistent and less stressful way. I want to give the University of Notre Dame Software Repository for Accounting and Finance a shout out for their excellent work. Thanks to them I can finally start taming this beast.

I've struggled for years trying to figure out how to download SEC data. The SEC's repository is so wacky that it's hard to find filings in there. The Notre Dame researchers created some Python scripts that let you automate the entire process. They even included parsing scripts that extract textual properties of each document. These properties are what I like to call "pre-labels," or draft targets for machine learning.

After parsing hundreds of thousands of 8-K and 8-K/A documents, I was able to generate a reference spreadsheet (see below). This is about 90% of the way to a final training set. All I need to do is generate a label from this information and then insert the entire SEC document into the training set (without HTML tags!). There are still a few more preprocessing steps, but it's almost there.

8-K SEC Documents Processed

There are a couple of things to note if you want to use their scripts: you'll need to tweak them to work for you. It took me about a day to figure everything out and then organize it the way I wanted. I created a /downloader folder and put the scripts in there, with an /EDGAR folder to hold all my downloaded text files.

For processing those files I created a /dataprep folder that contains all the scripts that generate the "pre-labels." My future HTML preprocessing script will sit there first, before I merge it into the main Generic_Parser.py script.
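
Here's a rough sketch of what that HTML preprocessing step could look like. It is not part of the Notre Dame scripts; it only uses Python's standard library html.parser, and the names TagStripper and strip_html are made up for illustration.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect only the text nodes of an HTML document."""

    # Content inside these tags is never visible text, so skip it.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse the runs of whitespace left behind by the removed markup.
        return " ".join(" ".join(self._chunks).split())


def strip_html(raw):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TagStripper()
    parser.feed(raw)
    parser.close()
    return parser.text()


# A made-up fragment, not a real filing.
sample = "<html><body><p>Item 2.02 &amp; Results</p><style>p{color:red}</style><b>8-K</b></body></html>"
print(strip_html(sample))  # Item 2.02 & Results 8-K
```

One caveat: joining the text chunks with spaces will split a word that has inline markup in the middle of it, so real filings may need a smarter join.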

If you want to use these scripts note that they are not for commercial use. As of today the software is licensed as: "All software and data are provided on an "as is" basis, without warranties, for non-commercial purposes. The software is free for academic researchers."


Autogenerating Support and Resistance Lines for Forex

On the heels of my last post, I've extended those functions to the EURUSD pair. The data starts in 2019 and runs through yesterday. It's actually a pretty neat script: it takes data from Oanda and then generates the support and resistance lines for that particular pair. The next step would be to create a buy/sell order in the Oanda Practice Account. Once I do that, it's a matter of writing a trading strategy and testing it in real time.

Everything I'm doing is completely academic and modular right now. I have no idea how to really build a Forex trading bot or even what strategies to use here. It's more of a "can I do it" endeavor.

Autogenerated Support and Resistance Lines, EURUSD

I'm fully convinced that retail traders who can learn Python can automate their entire trading strategies. Now, the flip side here is whether their strategies are worth anything. Just because we can automate trading setups doesn't mean we should stop asking, "Why do I think I'm right?"


Autogenerating Support and Resistance Lines

Work has been keeping me busy, but I found some time to figure out how to autogenerate support and resistance lines. I cobbled together some code I found online and then made a simple plot. I'm doing this to help me identify the 'zones' in Forex (mostly) and see if I can automate a trading bot to make trades for me.

Here's a chart from two years' worth of daily S&P500 closes. On the surface, the lines look pretty decent, but the real trick is figuring out what the right lookback period should be. Here my lookback period was 20 days.

There's more work to be done: I have to fix the x-axis to show the dates and get a larger time period. I'm also testing how to automate this script into a dashboard. Right now you can see a crappy jpg of the EURUSD currency pair on my labs site.
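
To make the lookback idea concrete, here's a minimal sketch of one simple way to get such lines: take the lowest and highest close over the previous N bars. This is an assumption for illustration and not necessarily the method behind the chart; the function name support_resistance and the sample closes are made up.

```python
def support_resistance(closes, lookback=20):
    """Return (support, resistance) lists aligned with `closes`.

    Each entry is the lowest / highest close over the previous `lookback`
    bars, or None until enough history exists.
    """
    support, resistance = [], []
    for i in range(len(closes)):
        if i < lookback:
            support.append(None)
            resistance.append(None)
            continue
        window = closes[i - lookback:i]  # the `lookback` closes before bar i
        support.append(min(window))
        resistance.append(max(window))
    return support, resistance


# Made-up closes, only to show the shape of the output.
closes = [10, 11, 12, 11, 10, 9, 10, 12, 13, 12]
support, resistance = support_resistance(closes, lookback=3)
print(support)     # [None, None, None, 10, 11, 10, 9, 9, 9, 10]
print(resistance)  # [None, None, None, 12, 12, 12, 11, 10, 12, 13]
```

A short lookback hugs the price and flips often, while a long one gives flatter, slower lines, which is why picking the period is the hard part.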

The generic code to build these support and resistance lines is here.


Functional Programming in Python

I'm spending time trying to understand the differences between writing classes and functions in Python. Which one is better and why? From what I'm gathering, a lot of people are tired of writing classes in general. Classes are used in Object Oriented Programming (OOP), and some Python coders hate it because it means writing too many lines of code when only a few really matter. So some programmers prefer functional programming (FP) in Python instead.

To that end, I've been watching both OOP and FP videos on the Internet and writing notes on them. Below is a great but also very deep video on functional programming in Python by Daniel Kirsch from PyData 2016. His presentation is about 30 minutes, with a great Q&A session.


Functional Programming in Python

My notes from the video above are below:

  • First Class Functions
  • Higher Order Functions
  • Purity
  • Immutability (not going to talk about it)
  • Composition
  • Partial Application & Currying
  • Purity, a function without 'side effects'
  • First Class Functions, simply means that functions are like everybody else
  • Can define with 'def' or lambda
  • Can use the name of functions as variables and do higher-order programming
  • Decorators "… provide a simple syntax for calling higher-order functions. By definition, a decorator is a function that takes another function and extends the behavior of the latter function without explicitly modifying it."
  • Partial function applications - "The primary tool supplied by the Functools module is the class partial, which can be used to “wrap” a callable object with default arguments. Partial objects are similar to function objects with slight differences. Partial function application makes it easier to write and maintain the code."
  • Partial functions are very powerful
  • "Currying transforms a function that takes multiple arguments in such a way that it can be called as a chain of functions. Each with a single argument (Partial Application)." via Wikipedia
  • The important concept for Currying is closures, aka lexical scoping
  • Remembers the variables in the scope where it was defined
  • List comprehensions vs functional equivalents
  • Map function vs list comprehension
  • Filter function vs list comprehension
  • Reduce vs list comprehension
  • Why not write out the loop instead? Using Map/Filter/Reduce is cleaner
  • Function composition: i.e. run a filter and then map: map(f, filter(p, seq))
  • 'import functools' is very useful
  • Main takeaways: Functional Programming is possible in Python (to a degree)
  • Main takeaways: Small composable functions are good
  • Main takeaways: FP == Build General Tools and Compose them
  • Python is missing: more list functions
  • Python is missing: Nicer lambda syntax
  • Python is missing: Automatic currying, composition syntax
  • Python is missing: ADTs (Sum Types)
  • Python is missing: Pattern Matching
  • Some remedies for list functions
  • Links provided in video @ 26:00
  • Suggest learning Haskell as a gateway to functional programming.
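
To tie a few of these notes together, here's a small sketch of partial application, currying with closures, a decorator, composition, and map/filter/reduce next to a list comprehension. The examples are my own and not taken from the talk; names like curried_add, compose, and logged are made up for illustration.

```python
from functools import partial, reduce, wraps


def power(base, exponent):
    return base ** exponent


# Partial application: fix one argument up front.
square = partial(power, exponent=2)


def curried_add(a):
    # Currying via closures: each inner function remembers `a` (and `b`).
    def add_b(b):
        def add_c(c):
            return a + b + c
        return add_c
    return add_b


def compose(*funcs):
    # compose(f, g)(x) == f(g(x)); Python has no built-in syntax for this.
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)


def logged(func):
    # A decorator is a higher-order function: it takes a function and returns one.
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        wrapper.calls.append((args, result))
        return result
    wrapper.calls = []
    return wrapper


@logged
def is_even(n):
    return n % 2 == 0


numbers = [1, 2, 3, 4, 5, 6]

# Filter then map, versus the equivalent list comprehension.
functional = list(map(square, filter(is_even, numbers)))
comprehension = [n ** 2 for n in numbers if n % 2 == 0]

total = reduce(lambda acc, n: acc + n, functional, 0)
add_one_then_square = compose(square, lambda n: n + 1)

print(functional)              # [4, 16, 36]
print(comprehension)           # [4, 16, 36]
print(total)                   # 56
print(curried_add(1)(2)(3))    # 6
print(add_one_then_square(3))  # 16
print(len(is_even.calls))      # 6
```

Note that is_even keeps its original behavior; the decorator only adds the call log on the side, without touching the function body.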


Changing Pinboard Tags with Python

Welcome to another automation post! This is a super simple Python script for changing misspelled or wrong tags in your Pinboard account. I started using Pinboard again because it helps me save all these great articles I read on the Interwebz, so I can paraphrase and regurgitate them back to you. Ha!

I need to clean out the Pinboard tags every so often because I hooked it up to Twitter. It works well for me because it saves all my retweets, favs and posts, but there's a lot of noise. Sometimes I end up with tags like "DataScience" and "DataScientists" when I really want "DataScience." I did some searching around and found the Pinboard Python library. Changing Pinboard tags with Python is EASY!

What you do need to do is install the Python package for Pinboard and get an API key from your Settings page. Then it's as simple as doing this:

Python Code

import pinboard

pb = pinboard.Pinboard('INSERT_YOUR_API_KEY_HERE')

old_tag = 'DataMining'
new_tag = 'DataScience'

pb.tags.rename(old=old_tag, new=new_tag)

You can, of course, modify this script to prompt for the API key and the tags instead of hard-coding them, and just do something like this:

import pinboard

# Prompt for the API key (found on your Pinboard Settings page)
passcode = input('Enter your Pinboard API key here: ')

pb = pinboard.Pinboard(passcode)

# Prompt for the tag to replace and the tag to replace it with
old_tag = input('Enter the old tag: ')
new_tag = input('Enter the new tag: ')

pb.tags.rename(old=old_tag, new=new_tag)

print('Converted: ' + old_tag + ' to: ' + new_tag)

Once again, the second script is all open source and free for you to use/modify as you see fit.

Note: I just regurgitated the original script (first one) and then riffed on it for the second one. The Author of Pinboard provided a sample in the documentation. Check that out too!
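
If you'd rather pass the tags on the command line than type them at a prompt, here's a rough sketch using argparse. It relies only on the same pinboard calls as the scripts above; the PINBOARD_API_KEY environment variable and the helper names parse_args, rename_tag, and main are my own assumptions, not part of the Pinboard library.

```python
import argparse
import os


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Rename a Pinboard tag.")
    parser.add_argument("old_tag", help="tag to replace")
    parser.add_argument("new_tag", help="tag to use instead")
    parser.add_argument(
        "--api-key",
        default=os.environ.get("PINBOARD_API_KEY"),  # hypothetical variable name
        help="Pinboard API key (defaults to $PINBOARD_API_KEY)",
    )
    return parser.parse_args(argv)


def rename_tag(client, old_tag, new_tag):
    """Rename a tag using any object that exposes tags.rename(old=..., new=...)."""
    client.tags.rename(old=old_tag, new=new_tag)
    return f"Converted: {old_tag} to: {new_tag}"


def main(argv=None):
    args = parse_args(argv)
    if not args.api_key:
        raise SystemExit("No API key given; pass --api-key or set PINBOARD_API_KEY.")

    import pinboard  # imported here so --help works without the package installed

    print(rename_tag(pinboard.Pinboard(args.api_key), args.old_tag, args.new_tag))


# To use this as a script, call main() at the bottom of the file.
# Parsing an explicit list here just shows what the arguments look like.
demo = parse_args(["DataMining", "DataScience", "--api-key", "not-a-real-key"])
print(demo.old_tag, "->", demo.new_tag)  # DataMining -> DataScience
```

Keeping the API key in an environment variable means it never ends up in your shell history the way a command-line argument would.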