Flask Tutorial: Upload CSV file and Insert rows into MySQL Database


Python is a cool language, soon as you finish your first projects with it, you will be hocked.

It's a general purpose language and it's gaining popularity for the recent years among data scientists and data engineers.

Personally, I recommend Python to newbies especially teens, not just because it has a simple direct learning curve, but also because it has a large community as well as a rich ecosystem.

About this tutorial

I was helping a friend through how to solve problems by dissecting them into small tiny problems and solve them one-by-one, So here is the code assignment as the main problem we are trying to solve.

The Problem:
Use Python to upload and import CSV file into a pre-defined table in MySQL database.

Steps

  1. Create file upload form
  2. Upload the CSV using Flask
  3. Parse CSV file data
  4. Connect to the database
  5. Insert rows into a specific table in the database

Stack we will use:

  1. Flask;
    A Python Minimal (Micro) web frameworks.
  2. MySQL;
    If you use Windows you can use WAMP or XAMPP, if you are using Linux XAMPP is a good choice. If you are using macOS, XAMPP or MAMP are good solutions.
  3. Pandas;
    Pandas is an open-source python library for data analytics and manipulation. It's widely used among data scientists and data engineers.
  4. mysql.connector;
    The library that we will use to connect to the MySQL database and insert the data.

Requirements

  • Install Flask;
    Here I will install Python directly with pip, but I strongly recommend using python-virtualenv for best practice.
pip install Flask
  • Install XAMPP;
    The XAMPP package will include MySQL database and PHPMyAdmin to manage our MySQL databases.

  • Install the required libraries (Pandas, sql.connector)
pip install pandas
pip install sql.connector

  • A sample CSV file, I will be using addresses.csv from here.

0 Get Flask ready

Now let's test if Flask works by starting to create our main file main.py

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
     return "Flask CSV Filer Uploader and Parser"

if (__name__ == "__main__"):
     app.run(port = 5000)

We can run the file by $ python main.py then heading tohttp://127.0.0.1:5000/.

If everything goes as planned, you will get the rendered message.

Next step is to create the HTML page which will be responsible for uploading the CSV files.

1- Create file upload form

Let's create templates directory and our HTML upload form in it.

<!doctype html>
<html>
  <head>
    <title>FLASK CSV File Upload</title>
  </head>
  <body>
    <h1>Upload your CSV file</h1>
    <form method="POST" action="" enctype="multipart/form-data">
      <p><input type="file" name="file"></p>
      <p><input type="submit" value="Submit"></p>
    </form>
  </body>
</html>

2- Upload the files to the server

Now we need to accept the incoming files from this form, let's add the reasonable function in main.py

from flask import Flask, render_template, request, redirect, url_for
import os
from os.path import join, dirname, realpath

app = Flask(__name__)

# enable debugging mode
app.config["DEBUG"] = True

# Upload folder
UPLOAD_FOLDER = 'static/files'
app.config['UPLOAD_FOLDER'] =  UPLOAD_FOLDER


# Root URL
@app.route('/')
def index():
     # Set The upload HTML template '\templates\index.html'
    return render_template('index.html')


# Get the uploaded files
@app.route("/", methods=['POST'])
def uploadFiles():
      # get the uploaded file
      uploaded_file = request.files['file']
      if uploaded_file.filename != '':
           file_path = os.path.join(app.config['UPLOAD_FOLDER'], uploaded_file.filename)
          # set the file path
           uploaded_file.save(file_path)
          # save the file
      return redirect(url_for('index'))

if (__name__ == "__main__"):
     app.run(port = 5000)

As everything is set, we can test the file upload now at http://127.0.0.1:5000/ If everything is ok, you can see the uploaded files at /static/files/

3- Parse CSV file

To parse the CSV file data into rows, we will use pandas the data analysis and manipulation library.

  • Import the library
import pandas as pd

Parse CSV function



def parseCSV(filePath):
      # CVS Column Names
      col_names = ['first_name','last_name','address', 'street', 'state' , 'zip']
      # Use Pandas to parse the CSV file
      csvData = pd.read_csv(filePath,names=col_names, header=None)
      # Loop through the Rows
      for i,row in csvData.iterrows():
             print(i,row['first_name'],row['last_name'],row['address'],row['street'],row['state'],row['zip'],)

Don't forget to add parseCSV(file_path) function soon after the file is saved in uploadFiles

# Output 
0 John Doe 120 jefferson st. Riverside  NJ 8075
1 Jack McGinnis 220 hobo Av. Phila  PA 9119
2 John "Da Man" Repici 120 Jefferson St. Riverside  NJ 8075
3 Stephen Tyler 7452 Terrace "At the Plaza" road SomeTown SD 91234
4 nan Blankman nan SomeTown  SD 298
5 Joan "the bone", Anne Jet 9th, at Terrace plc Desert City CO 123

4- Connect to the database (XAMPP/ MySQL)

  • In this step we will connect to the database and make sure it works.
  • Add import mysql.connector
  • Connect to the MySQL Server
import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="root",
  password="",
  database="databaseName"
)

Make sure to replace the conditionals right.

  • To test the Database Connection, let's list all available databases:
mycursor = mydb.cursor()

mycursor.execute("SHOW DATABASES")

# List All Databases
for x in mycursor:
  print(x)

Create the database and the table you want to insert the rows into it with the correspondent column names.


5- Insert rows into the database

With this step we will just update our loop to insert every row into the database

def parseCSV(filePath):
      # CVS Column Names
      col_names = ['first_name','last_name','address', 'street', 'state' , 'zip']
      # Use Pandas to parse the CSV file
      csvData = pd.read_csv(filePath,names=col_names, header=None)
      # Loop through the Rows
      for i,row in csvData.iterrows():
             sql = "INSERT INTO addresses (first_name, last_name, address, street, state, zip) VALUES (%s, %s, %s, %s, %s, %s)"
             value = (row['first_name'],row['last_name'],row['address'],row['street'],row['state'],str(row['zip']))
             mycursor.execute(sql, value, if_exists='append')
             mydb.commit()
             print(i,row['first_name'],row['last_name'],row['address'],row['street'],row['state'],row['zip'])

As everything goes as planned, check the new data entries at the database.

Troubleshooting and practice

As we are concluded this tutorial, here are some Troubleshooting and ideas to carry on and practice

  • NaN; don't send NaN values to the database
  • Use Secure Form for best practice
  • Create and insert table/ columns based on CSV structure
  • Insert new values only
  • Set upload options: accept only CSV files, limit file size
  • Upload multiple files
  • Move finished files to done directly as soon as they are inserted
  • Better error management
  • List all done and uploaded files in index.html
  • View CSV table in index.html
  • Convert the csv into json

Full code

main.py

from flask import Flask, render_template, request, redirect, url_for
import os
from os.path import join, dirname, realpath

import pandas as pd
import mysql.connector

app = Flask(__name__)

# enable debugging mode
app.config["DEBUG"] = True

# Upload folder
UPLOAD_FOLDER = 'static/files'
app.config['UPLOAD_FOLDER'] =  UPLOAD_FOLDER


# Database
mydb = mysql.connector.connect(
  host="localhost",
  user="root",
  password="",
  database="csvdata"
)

mycursor = mydb.cursor()

mycursor.execute("SHOW DATABASES")

# View All Database
for x in mycursor:
  print(x)





# Root URL
@app.route('/')
def index():
     # Set The upload HTML template '\templates\index.html'
    return render_template('index.html')


# Get the uploaded files
@app.route("/", methods=['POST'])
def uploadFiles():
      # get the uploaded file
      uploaded_file = request.files['file']
      if uploaded_file.filename != '':
           file_path = os.path.join(app.config['UPLOAD_FOLDER'], uploaded_file.filename)
          # set the file path
           uploaded_file.save(file_path)
           parseCSV(file_path)
          # save the file
      return redirect(url_for('index'))

def parseCSV(filePath):
      # CVS Column Names
      col_names = ['first_name','last_name','address', 'street', 'state' , 'zip']
      # Use Pandas to parse the CSV file
      csvData = pd.read_csv(filePath,names=col_names, header=None)
      # Loop through the Rows
      for i,row in csvData.iterrows():
             sql = "INSERT INTO addresses (first_name, last_name, address, street, state, zip) VALUES (%s, %s, %s, %s, %s, %s)"
             value = (row['first_name'],row['last_name'],row['address'],row['street'],row['state'],str(row['zip']))
             mycursor.execute(sql, value, if_exists='append')
             mydb.commit()
             print(i,row['first_name'],row['last_name'],row['address'],row['street'],row['state'],row['zip'])

if (__name__ == "__main__"):
     app.run(port = 5000)

If you like this quick tutorial, please share it and if you have any comment please let us know.


Resources

Photo by Hitesh Choudhary on Unsplash






Hamza Mu Author: Hamza Mu

A physician with programming skills, Linux user since late 1990s, Open source supporter. Coding with Python, NodeJS (Meteor, VueJS, Express, D3, PhantomJS), SmallTalk & R language.





Read more