Files

In computing, a file is a fundamental unit of digital storage used to hold information. This information can be anything from text and images to complex data structures. Programs use files to read data from and write data to storage devices.

Key Concepts:

  1. File Extensions:

    • A file extension (or filename extension) is a suffix at the end of a file name, following a period (e.g., .txt, .jpg, .exe).
    • It helps the operating system determine the file type and the associated application for opening it. For example, .docx files are typically opened with Microsoft Word.
  2. File Formats:

    • The file extension often reflects the file format, which describes how data is organized within the file. For instance, .csv files contain comma-separated values, while .mp3 files are used for audio.
    • Renaming a file extension doesn't change its format or content. For instance, changing a .csv file to .mp3 doesn't convert the file's data into audio.
  3. Executable Files:

    • Some files are executable, meaning they can perform actions by themselves, such as running a program or script, without needing another application to open them. Examples include .exe files on Windows.

  4. Directories and Files:

    • Files store data and can vary greatly in size, from a few bytes to several gigabytes.
    • Directories (or folders) organize files and other directories, helping manage and structure data on a computer. They themselves use minimal storage space.
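The point about renaming can be checked with a short sketch. The file names scores.csv and scores.mp3 are made up for illustration, and a temporary directory is used so that nothing is left behind.

```python
import os
import tempfile

# Hypothetical file names, created inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "scores.csv")
    with open(csv_path, "w") as f:
        f.write("name,score\nEniv,90\n")

    # Rename the file so that it carries an audio extension.
    mp3_path = os.path.join(tmp_dir, "scores.mp3")
    os.rename(csv_path, mp3_path)

    # splitext separates the extension from the rest of the name.
    extension = os.path.splitext(mp3_path)[1]
    with open(mp3_path) as f:
        renamed_content = f.read()

print(extension)        # .mp3
print(renamed_content)  # still the original comma-separated text
```

Only the name changed; the stored data is the same text as before.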

In Python, working with files typically involves using built-in functions and methods to open, read, write, and close files.

Types of Files

Python supports two main types of files: text files and binary files. Each type encodes data differently, affecting how the data is stored and interpreted.

Text Files

Text files contain data encoded as characters, using standard encoding schemes like ASCII or UTF-8. These files are often human-readable and include various formats, such as:

  • Plain Text (.txt)
  • Markup Languages (.html, .xml)
  • Source Code (.py, .java)
  • Configuration Files (.ini, .cfg)

Characteristics:

  • Readability: They can be opened and read by text editors (e.g., Notepad, TextEdit).
  • Structure: Text files consist of lines of text separated by End-of-Line (EOL) characters. The End-of-File (EOF) is usually not a stored character; it is the condition reported when no more data remains to be read.
  • Encoding: They use a character encoding scheme to interpret and display characters correctly.
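As a rough illustration of these characteristics, the sketch below writes two short lines with UTF-8 encoding and compares the number of characters with the number of bytes on disk. The file name words.txt is an assumption for the example.

```python
import os
import tempfile

# Two lines containing accented characters (written as escapes for portability).
text = "caf\u00e9\nna\u00efve\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "words.txt")  # hypothetical file name

    # newline="\n" keeps the EOL character unchanged on every platform.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(text)

    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()   # one list item per line, EOL included
        at_end = f.read()       # nothing left to read: empty string

    size_in_bytes = os.path.getsize(path)

print(len(lines))                 # 2
print(repr(at_end))               # ''
print(len(text), size_in_bytes)   # 11 13
```

The file holds 11 characters but 13 bytes, because each accented character takes two bytes in UTF-8. Reading at the end of the file simply returns an empty string.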

Binary Files

Binary files store data in a format that is not directly readable by humans. They contain bytes that can represent anything, from executable code to media files. Common types include:

  • Images (.jpg, .png)
  • Videos (.mp4, .avi)
  • Audio (.mp3, .wav)
  • Documents (.pdf, .docx)
  • Archives (.zip, .rar)

Characteristics:

  • Non-Readability: They often appear as garbled text in a text editor but can be interpreted by specific applications designed to handle the file type.
  • Headers: Binary files may include headers that provide metadata about the file’s format and contents.
  • Complex Data: They can contain structured data like multimedia, which requires appropriate software to interpret.
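Headers can be inspected directly. The sketch below checks for the 8-byte signature that begins every PNG file; the helper looks_like_png and both file names are hypothetical, and the "image" is a stand-in rather than real image data.

```python
import os
import tempfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of a PNG file


def looks_like_png(path):
    """Return True if the file starts with the PNG signature."""
    with open(path, "rb") as f:
        return f.read(8) == PNG_SIGNATURE


with tempfile.TemporaryDirectory() as tmp_dir:
    # A file with a PNG header but a misleading .txt extension.
    disguised = os.path.join(tmp_dir, "picture.txt")
    with open(disguised, "wb") as f:
        f.write(PNG_SIGNATURE + b"placeholder, not real image data")

    # A file with a .png extension that contains only text.
    mislabeled = os.path.join(tmp_dir, "notes.png")
    with open(mislabeled, "wb") as f:
        f.write(b"just some text")

    disguised_result = looks_like_png(disguised)
    mislabeled_result = looks_like_png(mislabeled)

print(disguised_result, mislabeled_result)  # True False
```

The check depends on the bytes in the file, not on its extension.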

File Extensions

Extensions often indicate the type of file:

  • Text Files: .txt, .csv, .md
  • Binary Files: .jpg, .exe, .pdf

File Paths

Files are accessed via paths, which can be:

  • Fully Qualified Paths: Complete paths from the root directory (e.g., C:\path\to\file.txt or /path/to/file.txt).
  • Relative Paths: Paths relative to the current directory (e.g., ..\file.txt).

Path Examples:

  • Fully Qualified: C:\docs\example.txt (Windows) or /home/user/docs/example.txt (Linux)
  • Relative: ..\example.txt (up one directory) or ./example.txt (current directory)
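One way to tell the two kinds of path apart in Python is the standard pathlib module. This sketch uses the "pure" path classes, which only manipulate path text and never touch the disk, so it behaves the same on any operating system.

```python
from pathlib import PurePosixPath, PureWindowsPath

windows_abs = PureWindowsPath(r"C:\docs\example.txt")
linux_abs = PurePosixPath("/home/user/docs/example.txt")
windows_rel = PureWindowsPath(r"..\example.txt")
linux_rel = PurePosixPath("./example.txt")

for path in (windows_abs, linux_abs, windows_rel, linux_rel):
    print(path, path.is_absolute())

# Joining a directory and a file name with the / operator.
joined = PurePosixPath("/home/user/docs") / "example.txt"
print(joined == linux_abs)  # True
```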

Creating and Reading Data in Python

In Python, file handling is essential for saving and retrieving data that persists beyond the program's execution. This involves creating, reading, writing, and closing files.

Opening and Creating Files

To work with files, you use the built-in open() function. This function returns a file object, which provides methods and attributes to interact with the file.

Syntax:

file_handler = open(title, mode)
  • title: The name of the file you want to open or create.
  • mode: Specifies the mode in which the file is opened. If omitted, the default mode is 'r' (read-only).

Common Modes:

Mode    Description
'r'     Read-only mode (default mode).
'w'     Write mode (creates a new file or truncates an existing file).
'a'     Append mode (adds data to the end of the file or creates a new file).
'r+'    Read and write mode (file must exist).
'w+'    Read and write mode (creates a new file or truncates an existing file).
'a+'    Read and append mode (adds data to the end of the file or creates a new file).
'x'     Exclusive creation (creates a new file but raises an error if the file already exists).
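The difference between 'w', 'a', and 'x' is easiest to see by running them against the same file. The following is a small sketch; the file name modes_demo.txt and the helper read_text are made up for the example.

```python
import os
import tempfile


def read_text(path):
    """Hypothetical helper: return the whole file as a string."""
    with open(path, "r") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")

    with open(path, "w") as f:   # 'w' creates the file
        f.write("first\n")
    with open(path, "a") as f:   # 'a' keeps existing data and adds to the end
        f.write("second\n")
    after_append = read_text(path)

    with open(path, "w") as f:   # 'w' again discards the old contents
        f.write("third\n")
    after_truncate = read_text(path)

    try:
        with open(path, "x"):    # 'x' refuses to touch an existing file
            pass
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

print(repr(after_append))    # 'first\nsecond\n'
print(repr(after_truncate))  # 'third\n'
print(exclusive_failed)      # True
```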

Examples:

file_handler = open("eniv.txt", "x")  # Creates a new file
file_handler = open("sun.txt", "r")     # Opens an existing file for reading
file_handler = open("data.txt", "w+")    # Opens a file for reading and writing; creates it if missing, truncates it if it exists

Writing Data to Files

You can write to a file using the write() or writelines() methods:

  • Writing Strings:

    file_handler.write("Hello, Students!")
  • Writing Multiple Lines:

    lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
    file_handler.writelines(lines)
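Neither write() nor writelines() adds line endings for you. The sketch below shows this using io.StringIO, an in-memory stand-in for a text file chosen here only so the result can be inspected without touching the disk.

```python
import io

buffer = io.StringIO()  # behaves like a file opened in text mode

# write() returns the number of characters written.
written = buffer.write("Hello, Students!")

# No "\n" supplied, so these two strings run together on one line.
buffer.writelines(["Line 1", "Line 2"])
buffer.write("\n")

# Adding the newline to each string gives one line per item.
buffer.writelines(line + "\n" for line in ["Line 3", "Line 4"])

result = buffer.getvalue()
print(written)       # 16
print(repr(result))  # 'Hello, Students!Line 1Line 2\nLine 3\nLine 4\n'
```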

Reading Data from Files

You can read from files using various methods:

  • Reading Entire File:

    content = file_handler.read()
  • Reading Line by Line:

    for line in file_handler:
        print(line, end="")
  • Reading Specific Number of Characters:

    partial_content = file_handler.read(100)  # Reads up to 100 characters from the current position

Closing Files

It is crucial to close a file when you are done with it. Closing flushes any buffered writes to disk and frees system resources:

file_handler.close()

Using with Statement

To simplify file handling and ensure files are properly closed even if an error occurs, use the with statement:

with open("example.txt", "r") as file_handler:
    content = file_handler.read()
    print(content)

The with statement automatically closes the file when the block is exited.
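To see that the file is closed even when an error occurs inside the block, consider this sketch. The file contents are chosen deliberately so that int() fails; the temporary directory is only there to keep the example self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")
    with open(path, "w") as f:
        f.write("42 is a number\n")

    try:
        with open(path, "r") as file_handler:
            # The text is not a valid integer, so this raises ValueError.
            number = int(file_handler.read())
    except ValueError:
        print("Could not parse the file contents")

    # The with block was left through an exception, yet the file is closed.
    closed_after_error = file_handler.closed

print(closed_after_error)  # True
```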

Example Code

files.py
# Writing to a file
with open("eniv.txt", "w") as file_handler:
    file_handler.write("Hello, Students!\n")
    lines = ["Welcome to Python programming.\n", "This is an example of file handling.\n"]
    file_handler.writelines(lines)
 
# Reading from the file
with open("eniv.txt", "r") as file_handler:
    # Read the entire file
    content = file_handler.read()
    print("Reading entire file:")
    print(content)
 
    # Reset file pointer to the beginning
    file_handler.seek(0)
 
    # Read line by line
    print("\nReading line by line:")
    for line in file_handler:
        print(line, end="")  # end="" prevents extra newlines
 
    # Reset file pointer again
    file_handler.seek(0)
 
    # Read a specific number of characters
    partial_content = file_handler.read(25)
    print("\n\nReading first 25 characters:")
    print(partial_content)

Output:

Reading entire file:
Hello, Students!
Welcome to Python programming.
This is an example of file handling.
 
 
Reading line by line:
Hello, Students!
Welcome to Python programming.
This is an example of file handling.
 
 
Reading first 25 characters:
Hello, Students!
Welcome

Handling File Paths

file_handler = open("C:\\path\\to\\file.txt", "r")  # Absolute path with escaped backslashes
file_handler = open(r"C:\path\to\file.txt", "r")   # Raw string to avoid escape sequences
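The reason for escaping or using a raw string becomes clear when a Windows path contains sequences such as \t or \n. The path below is hypothetical and is never opened; the sketch only compares the resulting strings.

```python
escaped = "C:\\temp\\new.txt"    # backslashes escaped by hand
raw = r"C:\temp\new.txt"         # raw string: backslashes kept as typed
unescaped = "C:\temp\new.txt"    # \t becomes a tab, \n becomes a newline

print(escaped == raw)            # True
print(len(raw), len(unescaped))  # 15 13
print("\t" in unescaped, "\n" in unescaped)  # True True
```

The unescaped version silently contains a tab and a newline, so it no longer names the intended file.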

File Object Attributes

When working with file objects, you can retrieve various attributes:

Attribute    Description
name         Returns the name of the file.
closed       Returns True if the file is closed, False otherwise.
mode         Returns the mode in which the file was opened.

Example:

file_handler = open("eniv.txt", "w")
print(f"File Name is {file_handler.name}")
print(f"File State is {file_handler.closed}")
print(f"File Opening Mode is {file_handler.mode}")
file_handler.close()

Output:

File Name is eniv.txt
File State is False
File Opening Mode is w

File Methods to Read and Write Data in Python

Python provides several methods to read from and write to files. Understanding these methods allows you to efficiently manage file data. Below are the key methods associated with file objects in Python.

1. Writing to a File

Method          Syntax                              Description
write()         file_handler.write(string)          Writes the contents of string to the file. Returns the number of characters written.
writelines()    file_handler.writelines(sequence)   Writes a sequence of strings to the file.

2. Reading from a File

Method         Syntax                       Description
read()         file_handler.read([size])    Reads up to size characters (bytes in binary mode). If size is omitted, reads the rest of the file.
readline()     file_handler.readline()      Reads a single line from the file, including its trailing newline.
readlines()    file_handler.readlines()     Reads all remaining lines in the file and returns them as a list.

Example Code

def write_and_read_file(title):
    with open(title, 'w') as f:
        f.write("Greetings from Eniv!\n")
        f.writelines(["This is the second line.\n", "And here's the third.\n"])
 
    with open(title, 'r') as f:
        print(f.read())  # Read entire file
        f.seek(0)
        print(f.readline())  # Read one line
        f.seek(0)
        print(f.readlines())  # Read all lines into a list
        f.seek(0)
        print(f.read(10))  # Read first 10 characters
 
# Example usage
write_and_read_file('eniv.txt')

Output

Greetings from Eniv!
This is the second line.
And here's the third.
 
Greetings from Eniv!
 
['Greetings from Eniv!\n', 'This is the second line.\n', "And here's the third.\n"]
Greetings

3. File Position and Seeking

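Method    Syntax                              Description
tell()    file_handler.tell()                 Returns the current position in the file. In binary mode this is the number of bytes from the beginning; in text mode it is an opaque number that is only meaningful when passed back to seek().
seek()    file_handler.seek(offset, whence)   Moves the file pointer to a specific position. whence specifies the reference point (0 = beginning, 1 = current position, 2 = end) and defaults to 0.

Note: in text mode, only seeks relative to the beginning are supported, apart from seek(0, 2), which jumps to the end of the file.

# Getting and setting file pointer position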
with open("eniv.txt", "r") as f:
    print(f.tell())  # Prints current position (0 initially)
    f.read(5)
    print(f.tell())  # Prints position after reading 5 characters (5 for this ASCII file)
    f.seek(0)        # Moves pointer back to the beginning
    print(f.tell())  # Prints 0, as pointer is at the beginning

Make sure the file “eniv.txt” exists in the directory before running the program.

Output

0
5
0
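Seeking relative to the current position or the end of the file works fully in binary mode. The sketch below assumes a small made-up file, seek_demo.bin, and uses the named constants os.SEEK_CUR and os.SEEK_END in place of the numbers 1 and 2.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "seek_demo.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "rb") as f:
        f.seek(-3, os.SEEK_END)   # 3 bytes before the end
        tail = f.read()

        f.seek(2)                 # 2 bytes from the beginning
        f.seek(3, os.SEEK_CUR)    # 3 bytes further on
        position = f.tell()
        middle = f.read(2)

print(tail)      # b'789'
print(position)  # 5
print(middle)    # b'56'
```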

Reading and Writing Binary Files in Python

Binary files store data in a format that is not human-readable and is usually specific to the application that created the file. Unlike text files, which can be read and written with text encoding, binary files are handled as bytes. Python provides the ability to work with binary files using the 'b' mode in the open() function.

1. Opening and Writing Binary Files

When opening a file for binary operations, append 'b' to the mode argument of the open() function. This will ensure that the file is treated as binary, and the data will be read or written as bytes objects without decoding.

Example:

# Writing binary data to a file
with open("example.bin", "wb") as binary_file:
    binary_file.write(b'\xDE\xAD\xBE\xEF')  # Write raw bytes

2. Reading Binary Files

To read from a binary file, open the file in 'rb' mode. You can then read the contents as bytes objects. This is particularly useful when dealing with non-text files like images or executables.

Example:

# Reading binary data from a file
with open("example.bin", "rb") as binary_file:
    data = binary_file.read()
    print(data)  # Output: b'\xde\xad\xbe\xef'

3. Example Programs

Example 1: Creating a New Image from an Existing Image

This example demonstrates how to copy a binary file (like an image) from one file to another.

def main():
    with open("eniv.jpg", "rb") as existing_image, open("new_eniv.jpg", "wb") as new_image:
        for each_line_bytes in existing_image:
            new_image.write(each_line_bytes)
 
if __name__ == "__main__":
    main()

In this program:

  • eniv.jpg is opened in binary read mode ("rb").
  • new_eniv.jpg is opened in binary write mode ("wb").
  • Iterating over a binary file yields chunks that each end at a newline byte (b'\n'), so the chunks vary in size; each chunk is written unchanged to the new file.
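Because image data has no real lines, a common alternative is to copy in fixed-size chunks. The following is a sketch of that approach; the function copy_in_chunks, its 64 KB default, and the file names are assumptions for the example, and the test data is generated rather than a real image. The standard library also offers shutil.copyfileobj for the same job.

```python
import os
import tempfile


def copy_in_chunks(source_path, destination_path, chunk_size=64 * 1024):
    """Copy a file by reading and writing fixed-size blocks of bytes."""
    with open(source_path, "rb") as source, open(destination_path, "wb") as destination:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            destination.write(chunk)


with tempfile.TemporaryDirectory() as tmp_dir:
    original = os.path.join(tmp_dir, "original.bin")
    duplicate = os.path.join(tmp_dir, "duplicate.bin")

    # 256,000 bytes of generated data standing in for an image.
    with open(original, "wb") as f:
        f.write(bytes(range(256)) * 1000)

    copy_in_chunks(original, duplicate)

    with open(original, "rb") as a, open(duplicate, "rb") as b:
        copies_match = a.read() == b.read()

print(copies_match)  # True
```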

Example 2: Reading and Printing Each Byte in a Binary File

This example writes a byte string to a file and then reads and prints each byte.

def main():
    with open("workfile", "wb") as f:
        f.write(b'abcdef')  # Write byte data
 
    with open("workfile", "rb") as f:
        byte = f.read(1)  # Read the first byte
        print("Print each byte in the file")
        while byte:
            print(byte)  # Output each byte
            byte = f.read(1)  # Read the next byte
 
if __name__ == "__main__":
    main()

Output:

Print each byte in the file
b'a'
b'b'
b'c'
b'd'
b'e'
b'f'

In this program:

  • workfile is opened in binary write mode ("wb") and written with the byte string b'abcdef'.
  • It is then reopened in binary read mode ("rb"), and each byte is read and printed until the end of the file.

4. Working with Bytes Objects

In Python, bytes literals and objects are handled differently from text strings. Here are some operations with bytes:

# Bytes literals and objects
print(b'Hello')  # Output: b'Hello'
print(type(b'Hello'))  # Output: <class 'bytes'>
 
# Create bytes objects
print(bytes(3))  # Output: b'\x00\x00\x00'
print(bytes([70]))  # Output: b'F'
print(bytes([72, 101, 108, 108, 111]))  # Output: b'Hello'
print(bytes('Hi', 'utf-8'))  # Output: b'Hi'
 
# Bytes object from a string with encoding
text = "Hello"
encoded_bytes = bytes(text, 'utf-8')
print(encoded_bytes)  # Output: b'Hello'

Explanation:

  • b'Hello' creates a bytes literal.
  • bytes() given an integer creates a zero-filled bytes object of that length; given a sequence of integers in the range 0-255, it creates a bytes object with those values.
  • When converting a string to bytes, specify the encoding (e.g., 'utf-8').
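Converting in the other direction uses decode(). This short sketch shows a round trip between str and bytes, and also that indexing a bytes object gives an integer rather than a one-character string.

```python
text = "Hello"
encoded = text.encode("utf-8")     # str -> bytes
decoded = encoded.decode("utf-8")  # bytes -> str

first_byte = encoded[0]            # indexing yields an int
first_two = encoded[:2]            # slicing yields bytes

# A non-ASCII character can take more than one byte in UTF-8.
accented = "\u00e9".encode("utf-8")

print(encoded, decoded)  # b'Hello' Hello
print(first_byte)        # 72
print(first_two)         # b'He'
print(accented)          # b'\xc3\xa9'
```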

Reading and Writing CSV Files in Python

CSV (Comma Separated Values) is a common format for storing tabular data. Each line in a CSV file represents a row of data, with fields separated by commas. This format is both human-readable and widely supported by various applications.

Python's built-in csv module provides functionalities to read from and write to CSV files efficiently.

1. Writing CSV Files

To write data to a CSV file, use the csv.writer function. It returns a writer object that converts user data into comma-separated values and writes them to the file.

Example:

import csv
 
# Writing to a CSV file
with open("contacts.csv", "w", newline='') as file:
    csv_writer = csv.writer(file)
    csv_writer.writerow(["Name", "Email", "Phone"])  # Write header
    csv_writer.writerow(["Eniv", "eniv@example.com", "123-456-7890"])
    csv_writer.writerow(["Vine", "vine@example.com", "987-654-3210"])

In this example:

  • "contacts.csv" is opened in write mode.
  • csv.writer writes rows of data to the file, including a header row.
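A field that itself contains a comma or a quote is handled by the writer automatically. The sketch below uses io.StringIO as an in-memory stand-in for a file so the raw output can be inspected; the "Note" column is made up for the example.

```python
import csv
import io

buffer = io.StringIO(newline="")  # in-memory stand-in for an open file
writer = csv.writer(buffer)

# writerows() writes several rows in one call.
writer.writerows([
    ["Name", "Note"],
    ["Eniv", "likes tea, coffee"],  # contains a comma
    ["Vine", 'said "hi"'],          # contains double quotes
])

csv_text = buffer.getvalue()
print(csv_text)
```

With the default settings, the writer wraps such fields in double quotes and doubles any quotes inside them, so the file can be read back without ambiguity.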

2. Reading CSV Files

To read data from a CSV file, use the csv.reader function. It returns a reader object that iterates over lines in the CSV file, converting each line into a list of strings.

Example:

import csv
 
# Reading from a CSV file
with open("contacts.csv", "r", newline='') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        print(row)  # Each row is a list of values

Output

['Name', 'Email', 'Phone']
['Eniv', 'eniv@example.com', '123-456-7890']
['Vine', 'vine@example.com', '987-654-3210']

In this example:

  • "contacts.csv" is opened in read mode.
  • csv.reader reads each line of the file, returning a list of values for each row.
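Since the reader is an iterator, the header row can be taken off with next() before looping over the data rows. This sketch reads from an in-memory string, used here only to keep the example self-contained.

```python
import csv
import io

sample = io.StringIO(
    "Name,Email,Phone\n"
    "Eniv,eniv@example.com,123-456-7890\n"
    "Vine,vine@example.com,987-654-3210\n"
)

reader = csv.reader(sample)
header = next(reader)                # consume the header row
emails = [row[1] for row in reader]  # remaining rows are data only

print(header)  # ['Name', 'Email', 'Phone']
print(emails)  # ['eniv@example.com', 'vine@example.com']
```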

3. Writing CSV Files with Dictionaries

To write rows as dictionaries, use csv.DictWriter. This class allows you to write dictionary objects to the CSV file, with fields corresponding to the specified field names.

Example:

import csv
 
# Writing dictionaries to a CSV file
with open("contacts.csv", "w", newline='') as file:
    fieldnames = ["Name", "Email", "Phone"]
    csv_writer = csv.DictWriter(file, fieldnames=fieldnames)
 
    csv_writer.writeheader()  # Write header
    csv_writer.writerow({"Name": "Eniv", "Email": "eniv@example.com", "Phone": "123-456-7890"})
    csv_writer.writerow({"Name": "Vine", "Email": "vine@example.com", "Phone": "987-654-3210"})

In this example:

  • "contacts.csv" is opened in write mode.
  • csv.DictWriter writes rows where each row is a dictionary with keys matching the field names.

4. Reading CSV Files with Dictionaries

For working with CSV files where rows are represented as dictionaries, use csv.DictReader. This class reads the CSV file into dictionaries, where the keys are derived from the header row.

Example:

import csv
 
# Reading CSV into dictionaries
with open("contacts.csv", "r", newline='') as file:
    csv_reader = csv.DictReader(file)
    for row in csv_reader:
        print(row)  # Each row is a dict (Python 3.8+); in Python 3.6 and 3.7 it is an OrderedDict

Output

{'Name': 'Eniv', 'Email': 'eniv@example.com', 'Phone': '123-456-7890'}
{'Name': 'Vine', 'Email': 'vine@example.com', 'Phone': '987-654-3210'}
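Every value produced by csv.DictReader is a string, so numeric columns need converting before use. A minimal sketch, assuming a made-up "Score" column and reading from an in-memory string:

```python
import csv
import io

sample = io.StringIO("Name,Score\nEniv,90\nVine,85\n")

rows = list(csv.DictReader(sample))
print(rows[0]["Score"], type(rows[0]["Score"]))  # 90 <class 'str'>

# Convert the text to integers before doing arithmetic.
scores = {row["Name"]: int(row["Score"]) for row in rows}
average = sum(scores.values()) / len(scores)

print(scores)   # {'Eniv': 90, 'Vine': 85}
print(average)  # 87.5
```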

Key Points:

  • csv.reader: Reads CSV data into lists of strings.
  • csv.writer: Writes data as comma-separated values.
  • csv.DictReader: Reads CSV data into dictionaries using header row for keys.
  • csv.DictWriter: Writes dictionaries to a CSV file, using the specified fieldnames for order and headers.