Tutorial

Using Python to prove a Friends trivia


This is a beginner-level tutorial for practicing Python. We will test a piece of trivia about the famous sitcom Friends using simple pattern recognition and basic scripting, and get familiar with some of Python's built-in modules along the way.


Introduction


Friends is a popular American sitcom. This tutorial looks at a piece of trivia about the show:

“The word ‘friends’ appears at least once in every episode of Friends.”

Can we prove this claim with a program? In this tutorial we will either prove or disprove it, using the following plan:

1. Download the subtitle text files for every episode of every season of Friends.

2. Run a loop to parse each episode’s subtitle file.

3. Search for the word “friends” in the subtitle file.


Step 1: Write a function that downloads a file from a URL


First, create a new folder named ‘Friends Trivia’ in your workspace. In that folder, create a new file called friends.py. We will write all of our code in that file and then run it.

Python's standard library offers several ways to download files over HTTP. We will use the urllib module; in Python 3, the urlretrieve function lives in urllib.request. Add the following function to friends.py:

import urllib.request

def downloadSeason(downloadUrl, fileName):
    # Download the file at downloadUrl and save it locally as fileName
    urllib.request.urlretrieve(downloadUrl, fileName)
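Running the script a second time downloads every archive again. As a minimal sketch of one way to avoid that, assuming Python 3, the hypothetical helper download_if_missing (not part of the original script) skips any file that is already on disk:

```python
import os
import urllib.request


def download_if_missing(download_url, file_name):
    """Download download_url to file_name unless the file already exists.

    Hypothetical helper for illustration only.
    Returns True if a download happened, False if it was skipped.
    """
    if os.path.exists(file_name):
        return False
    urllib.request.urlretrieve(download_url, file_name)
    return True
```

Calling it twice with the same arguments downloads the file only once; the second call returns False without touching the network.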


Step 2: Write a function to extract a zip file


After downloading, we need to extract the zip file. We will use the zipfile module for this. Add the following function to your code:

import zipfile

def extractZip(seasonNo):
    # Extract <seasonNo>.zip into a folder named after the season number
    fileName = str(seasonNo) + '.zip'
    with zipfile.ZipFile(fileName) as z:
        z.extractall(str(seasonNo))
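To see what extractall does without touching the network, here is a small self-contained sketch. It builds a throwaway archive in a temporary folder, extracts it into a folder named after the season, and returns the extracted file names. The function name zip_round_trip and the file names inside the archive are made up for illustration.

```python
import os
import tempfile
import zipfile


def zip_round_trip():
    """Create a tiny zip archive, extract it, and return the extracted file names."""
    work_dir = tempfile.mkdtemp()
    archive_path = os.path.join(work_dir, "1.zip")

    # Write two made-up subtitle files into the archive.
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("episode1.srt", "We are friends.")
        archive.writestr("episode2.srt", "Hello there.")

    # Extract into a folder named after the season number.
    target_dir = os.path.join(work_dir, "1")
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)

    return sorted(os.listdir(target_dir))


print(zip_round_trip())  # prints ['episode1.srt', 'episode2.srt']
```

extractall creates the target folder if it does not exist yet, which is why the season folders never need to be created by hand.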


Step 3: Download and extract the subtitle files of Friends


The subtitle files for Friends are available season by season on tvsubtitles.net. We could visit each season’s page and click the download button manually, or we can do it the cooler way, with a Python script. Looking closely, the URL of each season's subtitle zip file follows this pattern:

http://www.tvsubtitles.net/files/seasons/Friends%20-%20season%201.en.zip for Season 1

http://www.tvsubtitles.net/files/seasons/Friends%20-%20season%208.en.zip for Season 8

http://www.tvsubtitles.net/files/seasons/Friends%20-%20season%20x.en.zip for Season x

This pattern suggests a simple loop that downloads every season's file. In friends.py, add the following code:

url = "http://www.tvsubtitles.net/files/seasons/Friends%20-%20season%20"

for i in range(1, 11):  # seasons 1 through 10
    seasonUrl = url + str(i) + ".en.zip"
    fileName = str(i) + ".zip"
    downloadSeason(seasonUrl, fileName)
    extractZip(i)

This will download all the zip files and extract them to folders named after season numbers.
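Before starting any downloads, it can help to print the URLs and compare them with the pattern. The sketch below uses a hypothetical helper, season_url, that only builds the strings; it assumes the pattern observed for Seasons 1 and 8 holds for all ten seasons.

```python
BASE_URL = "http://www.tvsubtitles.net/files/seasons/Friends%20-%20season%20"


def season_url(season_no):
    """Return the subtitle archive URL for a season, following the observed pattern."""
    return BASE_URL + str(season_no) + ".en.zip"


# range(1, 11) yields 1 through 10; the upper bound is excluded.
for season_no in range(1, 11):
    print(season_no, season_url(season_no))
```

Note that range excludes its upper bound, so covering ten seasons requires range(1, 11).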


Step 4: Check each file for the occurrence of the word 'Friends'


Now that we have successfully downloaded all the subtitle files of Friends, let us check the file contents for our word. Ten folders have been created, one per season, and each folder contains the subtitles for all the episodes of that season. So we need to iterate over all the files in each folder. We use the built-in os module for this, as shown.

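import os

for i in range(1, 11):
    for subfile in os.listdir(str(i)):
        # Python 3: errors="ignore" skips bytes that are invalid in the default encoding
        with open(os.path.join(str(i), subfile), errors="ignore") as f:
            filecontents = f.read()
        # find returns -1 when the substring is absent
        if (filecontents.find("Friends") == -1 and filecontents.find("friends") == -1 and filecontents.find("FRIENDS") == -1):
            print('Season ' + str(i) + ", " + subfile)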

Inside the loop, we read the file contents with the file object's read method and check for the word "friends" using the string find method. The find method is case sensitive, so we test the common capitalizations of the word. If none of them occurs, we print the corresponding file name.
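Listing three spellings still misses mixed-case forms, and find also matches longer words such as "girlfriends". As an alternative sketch (not the approach used in this tutorial's script), a case-insensitive regular expression with word boundaries checks for the standalone word only. The function name mentions_friends is made up for illustration.

```python
import re

# \b marks a word boundary, so "girlfriends" does not count as a match.
FRIENDS_PATTERN = re.compile(r"\bfriends\b", re.IGNORECASE)


def mentions_friends(text):
    """Return True if the standalone word 'friends' occurs in text, in any case."""
    return FRIENDS_PATTERN.search(text) is not None


print(mentions_friends("We were on a break, FRIENDS!"))  # prints True
print(mentions_friends("He has two girlfriends."))       # prints False
```

Whether "girlfriends" should count toward the trivia is a matter of interpretation, so the stricter check may report more episodes than the find-based version.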


Step 5: Execute the code and observe the output


Finally, open your favourite terminal application and run the Python script.

The final code looks like this:

friends.py

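# Run "python3 friends.py" in the terminal

import os
import urllib.request
import zipfile

def downloadSeason(downloadUrl, fileName):
    # Download the file at downloadUrl and save it locally as fileName
    urllib.request.urlretrieve(downloadUrl, fileName)

def extractZip(seasonNo):
    # Extract <seasonNo>.zip into a folder named after the season number
    fileName = str(seasonNo) + '.zip'
    with zipfile.ZipFile(fileName) as z:
        z.extractall(str(seasonNo))

url = "http://www.tvsubtitles.net/files/seasons/Friends%20-%20season%20"

# Download and extract the subtitles for seasons 1 through 10
for i in range(1, 11):
    seasonUrl = url + str(i) + ".en.zip"
    fileName = str(i) + ".zip"
    downloadSeason(seasonUrl, fileName)
    extractZip(i)

# Print every episode whose subtitles never contain the word
for i in range(1, 11):
    for subfile in os.listdir(str(i)):
        # errors="ignore" skips bytes that are invalid in the default encoding
        with open(os.path.join(str(i), subfile), errors="ignore") as f:
            filecontents = f.read()
        if (filecontents.find("Friends") == -1 and filecontents.find("friends") == -1 and filecontents.find("FRIENDS") == -1):
            print('Season ' + str(i) + ", " + subfile)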

Try running this code on your machine. You will be surprised with the result!

Rohan Raja

A recent graduate in Mathematics and Computing from IIT Kharagpur, Rohan is a technology enthusiast and passionate programmer. Rohan likes to apply Mathematics and Artificial Intelligence to devise creative solutions to common problems.