My mentor teacher asked me to make a list containing the topic of each question on the New York State Regents exam from August 2010 and January 2011 (when the test is released to the public). Such a list can obviously be very helpful for curriculum planning in the state of New York. The exams themselves are readily available from this website, and they already include the topic of each question listed within the file. It’s just a massive copy and paste job to extract the problem topics.
As many of you might know, I would rather spend a few hours writing a program that extracts the question topics automatically than go into each file and manually copy and paste the 39 topics into an Excel spreadsheet.
So, I wrote a Python script that takes a publicly-released math Regents test, such as this:
and generates a text file with the question topic on each line, such as this:
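To make the idea concrete, here is a minimal sketch of the core step: find the `TOP:` marker on a line and keep whatever follows it. The function name `extract_topics` and the sample lines are made up for illustration; they are not taken from a real exam, and the real files may be laid out differently.

```python
def extract_topics(lines, keyword="TOP:"):
    """Return the text that follows `keyword` on each line that contains it."""
    topics = []
    for line in lines:
        words = line.split()
        if keyword in words:
            # Keep every word after the first occurrence of the marker.
            start = words.index(keyword) + 1
            topic = " ".join(words[start:])
            if topic:
                topics.append(topic)
    return topics


# Made-up sample lines, for illustration only (not from a real exam).
sample_lines = [
    "1 ANS: 2 PTS: 2 TOP: Solving Linear Equations",
    "This line has no topic marker.",
    "2 ANS: 4 PTS: 2 TOP: Slope",
]

for topic in extract_topics(sample_lines):
    print(topic)
```

Lines without the marker are simply skipped, so the output has one topic per question that carries a `TOP:` label.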
Here’s a step-by-step tutorial for how to use the Python script. If you are already familiar with Python, skip to step #3 for the source code. Note that I program in Windows, so these directions don’t apply to Mac users. Sorry.
1. Download Python for Windows from here. Choose “Windows x86 MSI installer” as the one to download.
2. To be safe, the Python executables (installed by default to a folder such as C:\Python32) should be in the Windows “Path.” This tells Windows where the executables are located. To do this, right-click “My Computer,” click “Properties,” choose the “Advanced” tab, and click “Environment Variables.” Under “System variables,” choose the one called “Path” and click “Edit.” Find the input box labeled “Variable value.” This box should contain a list of folder locations separated by semicolons. Go to the end of the list and add “;C:\Python32” (use the folder of the version you actually installed, e.g. C:\Python31 for Python 3.1). Click OK.
3. Hover over the code posted below with your mouse and select “copy to clipboard” in the top right-hand corner. Open WordPad (Start > All Programs > Accessories > WordPad) and paste the code into a new file. Click “Save As,” choose “Text Document” as the file type so that no formatting is saved, and save the file as “extractRegentsTopics.py”.
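import os
import sys

fileName = input('Please enter the file name of the Regents exam from which you are extracting topics.\n')
if not os.path.exists(fileName):
    sys.exit('Could not find file.')

# Read the whole exam into a list of lines.
logfile = open(fileName, "r").readlines()

KEYWORDS = ['TOP:']   # marker that precedes each question topic
topics = []           # extracted topics, one per question
counterline = []      # word positions at which the marker was found
counter = 0

for line in logfile:
    recordBool = 0    # becomes 1 once the marker has been seen on this line
    topicLine = ''
    for word in line.split():
        if recordBool == 1:
            # Every word after the marker belongs to the topic.
            topicLine += word
            topicLine += ' '
        counter += 1
        if word in KEYWORDS:
            counterline.append(counter)
            recordBool = 1
    if len(topicLine) > 1:
        # Drop the trailing space before storing the topic.
        l = topicLine[0:len(topicLine)-1]
        topics.append(l)

# Replace the three-letter extension (e.g. ".txt") with "_topics.txt".
topicFileName = fileName[0:len(fileName)-4] + '_topics.txt'
outFile = open(topicFileName, 'w')
for topic in topics:
    outFile.write(topic)
    outFile.write("\n")
print(topics)
outFile.close()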
4. Download one of the Word DOC Regents exams from this website.
5. The Python script cannot deal with the .doc format, so you have to save the exam as a raw text file. Open the Regents exam in Microsoft Word and choose “Save as.” When you save it, choose to save it as a “plain text” (txt) file. To do this, you might have to click on “Other Formats,” depending on what version of Word you are using.
6. Open Windows Explorer and navigate to the folder in which you saved extractRegentsTopics.py. Double click on the file.
7. When prompted by the program, type the full file name (including folder location) of the Regents exam text file that you created in step #5, e.g. C:\Users\carlberg\Documents809ExamIA.txt
8. In the same folder where the Regents exam is saved, a new file will be created with a file name of the form XXXXX_topics.txt, e.g. C:\Users\carlberg\Documents809ExamIA_topics.txt
9. Phew! That was complicated. But, now you have a file with the list of Regents exam question topics in the same folder as the exam! Now it’s time to play in Excel. Yay!
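If you would rather open the result directly in Excel, one option is to turn the topics file into a CSV file. The sketch below is only a suggestion and is not part of the script above: the function name `topics_to_csv` and the column headers are my own choices, and it assumes that the topics appear in the file in question order, so that the first topic belongs to question 1.

```python
import csv
import os


def topics_to_csv(topics_path):
    """Write a two-column CSV (question number, topic) next to the topics file."""
    csv_path = os.path.splitext(topics_path)[0] + ".csv"

    # Read the topics, skipping any blank lines.
    with open(topics_path, "r") as topics_file:
        topics = [line.strip() for line in topics_file if line.strip()]

    # newline="" is what the csv module documentation recommends for output files.
    with open(csv_path, "w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["Question", "Topic"])
        # Assumption: topics are listed in question order, starting at 1.
        for number, topic in enumerate(topics, start=1):
            writer.writerow([number, topic])
    return csv_path


# Example call (hypothetical path; substitute your own _topics.txt file):
# topics_to_csv(r"C:\path\to\exam_topics.txt")
```

Double-clicking the resulting .csv file should open it in Excel with the question numbers in the first column and the topics in the second.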
Let me know if you find this useful, or if you need some more help to get this working.