Searching through a file

When searching through data in a file, it’s very common to focus on specific lines that meet particular criteria. By combining the pattern for reading a file with string methods, we can build simple search mechanisms.
For example, if we wanted to read a file and print only the lines that start with the prefix “Python”, we could use the built-in string method startswith to select those lines:
# Open the file and print only the lines that begin with 'Python'
filename = open('P-files.txt')
for line in filename:
    if line.startswith('Python'):
        print(line)
When the program runs, we get the following output:
Python is a widely used high-level programming language used for general-purpose programming, created by Guido van Rossum and first released in 1991. An interpreted language, Python has a design philosophy which emphasizes code readability (notably using whitespace indentation to delimit code blocks rather than curly braces or keywords), and a syntax which allows programmers to express concepts in fewer lines of code than possible in languages such as C++ or Java.[24][25] The language provides constructs intended to enable writing clear programs on both a small and large scale.[26]

Python features a dynamic type system and automatic memory management and supports multiple programming paradigms, including object-oriented, imperative, functional programming, and procedural styles. It has a large and comprehensive standard library.[27]

Python is widely used and interpreters are available for many operating systems, allowing Python code to run on a wide variety of systems. CPython, the reference implementation of Python, is open source software[28] and has a community-based development model, as do nearly all of its variant implementations. CPython is managed by the non-profit Python Software Foundation.
The output is exactly what we expected, because we searched for lines that start with “Python”. Each line ends with a newline character, so print outputs the string in the variable line, which already includes a newline, and then adds another newline of its own, resulting in the double spacing we see.
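To see the trailing newline directly, one option is to print each line with repr, which shows the escape characters. The sketch below is only an illustration: it assumes Python 3 and uses io.StringIO with two made-up lines as a stand-in for P-files.txt.

```python
import io

# Hypothetical in-memory stand-in for the file (not the real P-files.txt)
sample = io.StringIO('Python is widely used\nPython features a dynamic type system\n')

seen = []
for line in sample:
    seen.append(line)
    # repr makes the trailing '\n' visible instead of printing a line break
    print(repr(line))
```

Each printed value ends with \n inside the quotes, which is the extra character responsible for the blank lines in the output.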
We could use string slicing to print all but the last character, but a simpler approach is to use the rstrip method, which strips whitespace from the right side of a string, as follows:
filename = open('P-files.txt')
for line in filename:
    # Remove the trailing newline (and any other trailing whitespace)
    line = line.rstrip()
    if line.startswith('Python'):
        print(line)
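The difference between the two approaches can be sketched on a couple of made-up strings. The sample values below are assumptions for illustration and do not come from P-files.txt.

```python
# Hypothetical sample line, as it would look when read from a file
line = 'Python is widely used\n'

sliced = line[:-1]         # drops exactly one character, whatever it is
stripped = line.rstrip()   # drops all trailing whitespace, including '\n'

print(repr(sliced))
print(repr(stripped))

# A final line without a trailing newline shows why slicing is fragile
last_line = 'no newline here'
print(repr(last_line[:-1]))      # the final 'e' is lost
print(repr(last_line.rstrip()))  # the text is left intact
```

Slicing assumes every line ends with exactly one newline, while rstrip leaves the text alone when there is nothing to strip. Note that rstrip also removes trailing spaces and tabs, which may or may not be what a given program wants.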
As your programs become more complex, it’s a good idea to structure your search loops using the continue statement. The idea is to skip uninteresting lines and, when we find an interesting line, perform some operation on it.
We can structure the loop to follow the pattern of skipping uninteresting lines as follows:
filename = open('P-files.txt')
for line in filename:
    line = line.rstrip()
    # Skip uninteresting lines
    if not line.startswith('Python'):
        continue
    # Process the interesting line
    print(line)
The output of the program is the same: lines that do not start with “Python” are skipped by the continue statement, and the interesting lines are printed with the built-in print function.
We can also use the find string method to simulate a text editor search, which finds lines where the search string appears anywhere in the line. The find method looks for an occurrence of one string within another and returns either the position of the match or -1 if the string was not found, so a line is skipped whenever find returns -1:
filename = open('P-files.txt')
for line in filename:
    line = line.rstrip()
    # find returns -1 when 'CPython' does not occur anywhere in the line
    if line.find('CPython') == -1:
        continue
    print(line)
The result is given below:

Python is widely used and interpreters are available for many operating systems, allowing Python code to run on a wide variety of systems. CPython, the reference implementation of Python, is open source software[28] and has a community-based development model, as do nearly all of its variant implementations. CPython is managed by the non-profit Python Software Foundation.
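The return values of find can be checked on a single string. The line below is a made-up sample, and the in operator is shown only as a possible alternative when the position itself is not needed.

```python
# Hypothetical sample line; not read from P-files.txt
line = 'CPython, the reference implementation of Python, is open source'

pos_python = line.find('Python')     # first match starts inside 'CPython'
pos_cpython = line.find('CPython')   # match at the very start of the line
pos_missing = line.find('Jython')    # search string is absent

print(pos_python)    # 1
print(pos_cpython)   # 0
print(pos_missing)   # -1

# When only a yes/no answer is needed, the in operator reads more directly
print('CPython' in line)   # True
```

Because a match at the start of the line returns 0, the test has to compare against -1 explicitly; treating the result of find as a true/false value would wrongly skip such lines.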
