Build a Google Drive Organizer with Claude Code
What you'll build
A Python tool that maps your entire Google Drive, shows you exactly how messy it is, then organizes everything into clean folders by type and year.
What You're Building
Your Google Drive is a mess. Thousands of files with no structure — documents mixed with photos, invoices next to travel journals, meeting recordings buried under old homework. You've been meaning to organize it for years, but who has the time?
Today you'll build a tool that does it for you. In 5 steps, you'll go from a chaotic Drive to a clean folder structure — and learn how APIs, file classification, and pattern matching work along the way.
Before You Start
- Python 3.8+ (python3 --version to check)
- A Google Cloud project with the Drive API enabled
- A credentials.json file (Google Cloud Console > APIs & Services > Credentials > Create OAuth 2.0 Client ID > Download JSON)
- Install dependencies:
pip install google-auth google-auth-oauthlib google-api-python-client
Set up your project folder:
mkdir drive-organizer && cd drive-organizer
Drop your credentials.json into this folder.
Milestone 1: Connect to Google Drive
We need to talk to Google's API. That means authentication.
Prompt:
Create a Python script called map_drive.py that authenticates with Google Drive using OAuth2. It should look for a credentials.json file in the same directory, handle token refresh, and print "Connected to Google Drive!" when successful. Use the google-auth and google-api-python-client libraries.
What Claude Code does: It generates the full OAuth2 flow — credentials file, token caching, and automatic refresh. The first time you run it, a browser window opens for Google sign-in. After that, it uses a saved token so you don't have to sign in again.
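For reference, the flow described above usually looks something like the sketch below. This is not the exact code Claude Code will produce. The token.json cache file name, the get_drive_service function name, and the choice of the read-only metadata scope are assumptions made for illustration. Moving files in Milestone 4 needs a broader scope such as https://www.googleapis.com/auth/drive.

```python
import os

# Assumption: read-only metadata access is enough for mapping the Drive.
SCOPES = ["https://www.googleapis.com/auth/drive.metadata.readonly"]
TOKEN_FILE = "token.json"  # assumed name for the cached token


def get_drive_service(credentials_file="credentials.json", token_file=TOKEN_FILE):
    """Return an authenticated Drive v3 client, reusing a cached token if possible."""
    # Imported inside the function so this file can be loaded
    # even before the dependencies are installed.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow
    from googleapiclient.discovery import build

    creds = None
    if os.path.exists(token_file):
        # Later runs: load the token saved by a previous sign-in.
        creds = Credentials.from_authorized_user_file(token_file, SCOPES)

    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            # Token expired but refreshable: no browser needed.
            creds.refresh(Request())
        else:
            # First run: open a browser window for Google sign-in.
            flow = InstalledAppFlow.from_client_secrets_file(credentials_file, SCOPES)
            creds = flow.run_local_server(port=0)
        # Cache the token so the next run skips the sign-in.
        with open(token_file, "w") as fh:
            fh.write(creds.to_json())

    return build("drive", "v3", credentials=creds)
```

Calling get_drive_service() and then printing "Connected to Google Drive!" reproduces the behavior the prompt asks for. If you change SCOPES later, delete the cached token file so the sign-in runs again with the new permissions.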
Try it: python map_drive.py
A browser opens, you authorize, and you see "Connected to Google Drive!" You just talked to Google's API from your terminal.
Milestone 2: Map Your Entire Drive
Now let's see what we're working with.
Prompt:
Extend map_drive.py to scan my entire Google Drive. Fetch all folders and all files using the files.list API. For each folder, track its name, parent folder ID, and children. For each file, use its parent folder ID to determine its location and classify its type (image, video, document, spreadsheet, presentation, PDF, audio, archive) based on MIME type. Count how many files are sitting in root with no folder. Print a summary: total folders, total files, files in root, and the top 10 largest folders by file count.
What Claude Code does: It adds paginated API scanning. Google Drive returns results one page at a time (up to 1,000 files per page when you ask for the maximum page size), so the script loops through page tokens until every file is fetched. It builds a folder tree by tracking parent-child relationships, then maps each file to its folder.
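The pagination loop is the part worth understanding. Here is a minimal sketch, assuming service is an authenticated Drive v3 client. The function names list_all_files and split_folders_and_files are made up for this example, and the generated script may structure things differently.

```python
FOLDER_MIME = "application/vnd.google-apps.folder"  # Drive's MIME type for folders


def list_all_files(service, page_size=1000):
    """Fetch every non-trashed item by following nextPageToken until it runs out."""
    items = []
    page_token = None
    while True:
        response = service.files().list(
            q="trashed = false",
            pageSize=page_size,
            fields="nextPageToken, files(id, name, mimeType, parents, createdTime)",
            pageToken=page_token,
        ).execute()
        items.extend(response.get("files", []))
        # The API omits nextPageToken on the last page.
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return items


def split_folders_and_files(items):
    """Separate folders (keyed by ID) from regular files."""
    folders = {i["id"]: i for i in items if i["mimeType"] == FOLDER_MIME}
    files = [i for i in items if i["mimeType"] != FOLDER_MIME]
    return folders, files
```

Folders and files come back from the same files.list call, so splitting on the folder MIME type is what lets the script build the tree from one scan.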
Try it: python map_drive.py
Watch the progress counter tick up. When it finishes, you'll see a summary with your total file count, how many folders you have, and — the scary number — how many files are just floating in root with no home.
Milestone 3: Generate a Structure Report
Numbers are useful, but we need to see the actual structure.
Prompt:
Add a report feature to map_drive.py. After scanning, save a human-readable report to drive_report.txt showing: all files in root (count and type breakdown), each top-level folder with its file count, year range, and sample file names, and the 20 largest folders. Also save the raw scan data as JSON to drive_map.json for later use. Use recursive functions to calculate total file counts and year ranges through nested subfolders.
What Claude Code does: It builds tree traversal in both directions: functions that walk up the parent chain to build full paths, and recursive functions that walk down through children to compute total file counts and year ranges. The report turns raw API data into a readable map of your Drive.
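The two directions of traversal can be sketched as follows. The data shapes here are assumptions for illustration: folders maps a folder ID to a dict with "name" and "parent", children maps a folder ID to its subfolder IDs, and file_counts holds the number of files sitting directly in each folder.

```python
def build_path(folder_id, folders):
    """Walk up the parent chain to build a path like 'Work/Clients/Acme'."""
    parts = []
    seen = set()  # guards against a malformed parent loop
    while folder_id in folders and folder_id not in seen:
        seen.add(folder_id)
        folder = folders[folder_id]
        parts.append(folder["name"])
        folder_id = folder.get("parent")
    return "/".join(reversed(parts))


def total_file_count(folder_id, children, file_counts):
    """Count files directly in a folder plus everything nested beneath it."""
    total = file_counts.get(folder_id, 0)
    for child_id in children.get(folder_id, []):
        total += total_file_count(child_id, children, file_counts)
    return total


folders = {
    "a": {"name": "Work", "parent": "root"},
    "b": {"name": "Clients", "parent": "a"},
    "c": {"name": "Acme", "parent": "b"},
}
children = {"a": ["b"], "b": ["c"]}
file_counts = {"a": 2, "b": 3, "c": 5}

print(build_path("c", folders))                      # Work/Clients/Acme
print(total_file_count("a", children, file_counts))  # 10
```

A year range can be computed with the same downward recursion by returning the minimum and maximum year instead of a sum.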
Try it: python map_drive.py then open drive_report.txt.
Your entire Drive, mapped out. Every top-level folder, what's inside, how big it is, and how many orphan files are floating in root. This is your "before" snapshot.
Milestone 4: Organize by Type and Year
Time to fix it. The simplest useful organization: sort files by what they are and when they were created.
Prompt:
Create a new script called organize_drive.py. It should authenticate with Google Drive using the same auth pattern, scan all files, and organize them into an "Organized" folder with subfolders by type and year — like Organized/Pictures/2023, Organized/Documents/2024. Map MIME types to categories: images go to Pictures, videos to Videos, Google Docs and Word files to Documents, spreadsheets to Spreadsheets, PDFs to PDFs, audio to Audio. Add a --dry-run flag that shows what would be moved without actually moving anything. Always show a confirmation prompt before executing moves.
What Claude Code does: It generates the organizer with dry-run mode — the most important pattern when building tools that modify data. It uses the Drive API's addParents/removeParents update to move files between folders without copying. The plan shows you exactly what will happen before a single file moves.
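As a rough sketch of those two ideas, the code below picks a target folder from the MIME type and creation year, then either describes the move (dry run) or performs it with a files.update call. The names target_folder and move_file and the exact MIME table are assumptions for this example, not the generated script. A real run also needs the destination folder's ID and an OAuth scope that allows writes.

```python
CATEGORY_BY_MIME = {
    "application/vnd.google-apps.document": "Documents",
    "application/msword": "Documents",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": "Documents",
    "application/vnd.google-apps.spreadsheet": "Spreadsheets",
    "application/vnd.ms-excel": "Spreadsheets",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": "Spreadsheets",
    "application/pdf": "PDFs",
}
CATEGORY_BY_PREFIX = {"image/": "Pictures", "video/": "Videos", "audio/": "Audio"}


def target_folder(file):
    """Return a path like 'Organized/Pictures/2023', or None if the type is unmapped."""
    mime = file["mimeType"]
    category = CATEGORY_BY_MIME.get(mime)
    if category is None:
        for prefix, name in CATEGORY_BY_PREFIX.items():
            if mime.startswith(prefix):
                category = name
                break
    if category is None:
        return None
    year = file["createdTime"][:4]  # createdTime looks like 2023-05-01T12:00:00.000Z
    return f"Organized/{category}/{year}"


def move_file(service, file, new_parent_id, dry_run=True):
    """Move a file by swapping its parent folder; with dry_run, only describe the move."""
    if dry_run:
        return f"[dry run] would move {file['name']}"
    service.files().update(
        fileId=file["id"],
        addParents=new_parent_id,
        removeParents=",".join(file.get("parents", [])),
        fields="id, parents",
    ).execute()
    return f"moved {file['name']}"
```

Defaulting dry_run to True means a forgotten argument produces a preview rather than a real move, which is the safer failure mode for a tool that rearranges thousands of files.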
Try it: python organize_drive.py --dry-run
You'll see the full plan: every file category, every year bucket, with counts and sample names. When you're ready, run python organize_drive.py and confirm. Watch your files sort themselves.
Milestone 5: Smart Classification with Pattern Matching
Type + Year works for photos and videos, but 2,000 documents in "Documents/2023" isn't very helpful. We need to understand what files are about.
Prompt:
Create a script called deep_scan.py that classifies files by analyzing their names and folder paths with case-insensitive regex patterns. Define categories: business documents (keywords like "invoice", "proposal", "contract", "budget"), personal writing ("journal", "diary", "recap", "travel"), meetings ("meeting", "notes", "recording", "agenda"), marketing ("campaign", "newsletter", "social media", "funnel"), and education ("course", "lesson", "homework", "exam"). For each file, check its name and path against all patterns and assign matching tags. Generate a report grouped by category and time period (pre-2020, 2020-2022, 2023+). List uncategorized files separately. Save to scattered_files_report.txt.
What Claude Code does: It builds a pattern-based classifier. Each category has a list of regex patterns, and every file name and path gets checked against all of them. The key insight: semantic classification beats mechanical sorting. A file named "Q3 Budget Review" isn't just a "document". It's a business file, and knowing that is far more useful than knowing it's a Google Doc from 2024.
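A minimal version of such a classifier, using the keywords from the prompt, might look like this. The category keys and the function names classify and period are illustrative choices, and the generated script will likely use longer pattern lists.

```python
import re

CATEGORY_PATTERNS = {
    "business": ["invoice", "proposal", "contract", "budget"],
    "personal_writing": ["journal", "diary", "recap", "travel"],
    "meetings": ["meeting", "notes", "recording", "agenda"],
    "marketing": ["campaign", "newsletter", r"social[ _-]?media", "funnel"],
    "education": ["course", "lesson", "homework", "exam"],
}

# One case-insensitive regex per category, built by joining its keywords with "|".
COMPILED = {
    category: re.compile("|".join(patterns), re.IGNORECASE)
    for category, patterns in CATEGORY_PATTERNS.items()
}


def classify(name, path=""):
    """Return every category whose pattern matches the file's path or name."""
    text = f"{path}/{name}"
    return [category for category, pattern in COMPILED.items() if pattern.search(text)]


def period(year):
    """Bucket a year into the report's three time periods."""
    if year < 2020:
        return "pre-2020"
    if year <= 2022:
        return "2020-2022"
    return "2023+"


print(classify("Q3 Budget Review"))                 # ['business']
print(classify("Team Meeting Notes", "Work/2021"))  # ['meetings']
print(classify("IMG_0042.jpg"))                     # []
```

These patterns match substrings, so "exam" also matches "example". Wrapping keywords in \b word boundaries is one way to tighten that if you see too many false positives in your report.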
Try it: python deep_scan.py
The report shows your Drive organized by meaning: business files, personal writing, meeting notes, marketing material — all identified automatically. The "uncategorized" section shows what still needs human judgment. You just taught your script to read.
What You Built
A three-part toolkit:
- map_drive.py: scans and reports your entire Drive structure
- organize_drive.py: sorts files into Type > Year folders with dry-run safety
- deep_scan.py: classifies files by name and folder path using pattern matching
The big lesson: start with a map, then sort mechanically, then sort semantically. Mechanical sorting (by type) handles media. Semantic sorting (by meaning) handles documents. Together they cover everything.
Take It Further
- Add a cleanup script that finds and trashes junk folders (old node_modules, empty files)
- Build a final consolidation step that reduces your root to 5-6 top-level folders
- Add support for bilingual file names if you work in multiple languages
- Schedule the mapper to run weekly and alert you when clutter accumulates