Mostly Linux & Python syntax notes and hyperlinks.

Thursday, September 10, 2009

starting python: file I/O, strip(), [start:end]

Janak left me with a python script to process a directory full of files. He said I'd have to modify it a little. He went home.

First thing to look up: To add a comment, put a "#" at the start of the line.
Useful for note-taking in the code, and commenting out the complex stuff while I try to get the easy stuff working.

Well, I have a file I'm writing to:
fd=open("list.txt","w+")
fd.write(var+' is the value of var.\n')
And a file I'm reading from:
for line in open(from_dir + "\\" + in_file):
It's cool that a single line opens the file and reads it one line at a time, with the for loop ending when the file runs out of lines. And what that syntax does is obvious even to a novice.
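Here is a minimal sketch of the write-then-read pattern above, runnable on its own. The temporary directory, the second line of text, and the names list_path and lines_read are made up for the illustration; they are not from Janak's script. It uses "w" rather than "w+" because the file is only written to here, and with-blocks so the files get closed without an explicit close().

```python
import os
import tempfile

var = "42"  # stand-in value; the real script's var is not shown

with tempfile.TemporaryDirectory() as work_dir:
    # Hypothetical location for list.txt, kept out of the current directory.
    list_path = os.path.join(work_dir, "list.txt")

    # "w" creates the file, or empties it if it already exists.
    with open(list_path, "w") as fd:
        fd.write(var + " is the value of var.\n")
        fd.write("second line\n")

    # Looping over the file object hands back one line per pass,
    # with the trailing newline still attached.
    lines_read = []
    with open(list_path) as infile:
        for line in infile:
            lines_read.append(line)

print(lines_read)
```

Each line comes back with its "\n" still on the end, which is one reason the script keeps calling strip().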

I'm not sure about the "\\". This is in Windows, but why two backslashes? Is the first \ to escape the second one? In Linux, would it be "\/" or "/"? Is there a unix/windows-neutral version?
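A sketch that answers these questions, as far as I can tell: yes, in an ordinary string literal the first backslash escapes the second, so "\\" is one backslash character. On Linux the separator is a plain "/". The neutral version is os.path.join, which picks the separator for whatever system the script runs on. The values given to from_dir and in_file below are invented for the example.

```python
import ntpath      # the Windows flavour of os.path
import os
import posixpath   # the Unix/Linux flavour of os.path

from_dir = "data"       # hypothetical directory name
in_file = "sample.txt"  # hypothetical file name

# Two characters in the source, one character in the string.
print(len("\\"))

# os.path.join glues the parts together with the local separator.
path = os.path.join(from_dir, in_file)
print(path)

# The two flavours can be called directly to see both results on any system.
print(ntpath.join(from_dir, in_file))     # data\sample.txt
print(posixpath.join(from_dir, in_file))  # data/sample.txt
```

So the loop header could be written as for line in open(os.path.join(from_dir, in_file)): and work on both systems.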

Next, he keeps using this strip() function:
line=line.strip()
Here's the official syntax: http://docs.python.org/library/string.html
In a tutorial: http://www.tutorialspoint.com/python/string_strip.htm
If you don't put anything in the parentheses, it strips the whitespace (spaces, tabs, newlines) from the start and end of the string.
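A few quick experiments with strip() and its relatives lstrip() and rstrip(). The sample strings are made up. repr() is used so the leftover spaces and newlines are visible in the output.

```python
line = "  abcde.fgh \n"   # made-up sample line, as it might come from a file

stripped = line.strip()           # both ends
left_only = line.lstrip()         # start only
no_newline = line.rstrip("\n")    # only newlines, only at the end

print(repr(stripped))    # 'abcde.fgh'
print(repr(left_only))   # 'abcde.fgh \n'
print(repr(no_newline))  # '  abcde.fgh '

# With an argument, strip() removes any of those characters from both ends.
# It never touches the middle of the string.
trimmed = "xxab xcxx".strip("x")
print(repr(trimmed))     # 'ab xc'
```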

He also uses square brackets to pull out substrings from our lines of data.
The negative value in the range is hard to figure out, but it means "start counting from the end".
If you get a python prompt, you can experiment with square bracket syntax.
It's good to see that it doesn't freak if you try to pull out an impossible range. It just returns an empty string.
>>> s1="abcde.fgh"
>>> s1
'abcde.fgh'
>>> s1[0:2]
'ab'
>>> s1[0:-4]
'abcde'
>>> s1[2:3]
'c'
>>> s1[2:4]
'cd'
>>> s1[-4:-2]
'.f'
>>> s1[-2:-4]
''
>>> s1[5:-3]
'.'
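One caveat to the "doesn't freak" observation, sketched with the same s1: ranges are forgiving, but asking for a single character past the end is not. Leaving out one side of the range means "from the start" or "to the end". The variable names other than s1 are just for this example.

```python
s1 = "abcde.fgh"

way_past_end = s1[20:30]   # range entirely off the end
runs_off_end = s1[7:100]   # range that starts inside and overshoots
first_five = s1[:5]        # no start given: begin at the start
last_three = s1[-3:]       # no end given: go to the end

print(repr(way_past_end))  # ''
print(repr(runs_off_end))  # 'gh'
print(repr(first_five))    # 'abcde'
print(repr(last_three))    # 'fgh'

# A single index (no colon) must point at a real character.
try:
    s1[20]
    got_index_error = False
except IndexError:
    got_index_error = True
print(got_index_error)     # True
```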
If you're testing the start or end of a string, don't use ranges.
Use startswith and endswith instead (name here is any string variable; calling it str would hide Python's built-in str type):
if name.endswith(".txt"):
if name.startswith("my_files_"):
You'll avoid goofs from miscounting characters, and it's also easier to read and understand.
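A small sketch of those two tests picking files out of a list. The file names are invented, and the list comprehensions are just one compact way to write the loop.

```python
# Hypothetical directory listing for the example.
file_names = ["my_files_01.txt", "my_files_02.csv", "notes.txt"]

txt_files = [n for n in file_names if n.endswith(".txt")]
my_files = [n for n in file_names if n.startswith("my_files_")]

print(txt_files)  # ['my_files_01.txt', 'notes.txt']
print(my_files)   # ['my_files_01.txt', 'my_files_02.csv']

# The range version needs the count (-4) to match the length of ".txt".
same_answer = ("notes.txt"[-4:] == ".txt") == "notes.txt".endswith(".txt")
print(same_answer)  # True

# Both methods also accept a tuple, to test several endings at once.
data_files = [n for n in file_names if n.endswith((".txt", ".csv"))]
print(data_files)
```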
