|
|
|
Importing CSV file into Database with Schema.ini |
|
Posted by
Moderator1
on
8/30/2008 3:32:35 AM
|
Category:
Asp.Net 2.0 |
|
|
Total Views :
76724 |
|
Adding to your Favorites....
|
|
|
 |
|
|
|
|
|
|
Introduction
|
|
In real time web applications, there come several situations to import data from
an external data sources such as Excel file, CSV file, text file etc. Asp.Net Framework
provides a simple solution to import the data from the external sources to the database
with the Microsoft.Jet.OLEDB provider. But difficulties arise when the external
data file contains, many columns with different data types. Basically, the provider
cannot differentiate data types between the columns or the rows, blindly it will
consider them as a data type based on first few rows and leave all the data which
does not match the data type. To overcome this problem, we use schema.ini file to
define the data type of the CSV or text file and allow the provider to read that
and recognize the exact data types of each column. And this article mainly concentrates
on how to import a CSV file with different data types into database, using Microsoft
Jet Database Engine and also with Schema.ini. |
|
|
What is Schema.ini file?
Schema.ini is a information file, used to define the data structure and format of
each column that contains data in the CSV file. If schema.ini file exists in the
directory, Microsoft.Jet.OLEDB provider automatically reads it and recognizes the
data type information of each column in the CSV file. Thus, the provider intelligently
avoids the misinterpretation of data types before inserting the data into the database.
Points to remember before creating Schema.ini
1. The schema information file, must always named as 'schema.ini'.
2. The schema.ini file must be kept in the same directory where the CSV file exists.
3. The schema.ini file must be created before reading the CSV file.
4. The first line of the schema.ini, must the name of the CSV file, followed by
the properties of the CSV file, and then the properties of the each column in the
CSV file.
For demonstration, consider a CSV file with 5 columns such as Date, First Name,
Last Name, Age and Salary. The data structure of these columns are given below
|
|
Column Name |
Data Type |
Width |
Format |
|
Date |
DateTime |
|
dd-MMM-yyyy |
|
First Name |
Text |
100 |
|
|
Last Name |
Text |
100 |
|
|
Age |
Long |
|
|
|
Salary |
Double |
|
|
|
Let us consider that we need to upload the CSV file in two different scenarios,
with same CSV file name and with different CSV file name whenever we upload.
Import a CSV file every time with the SAME CSV file name.
If you want to import a CSV file every time with same file name, just do the following
steps.
1. Open a Notepad file.
2. copy and paste the below content in the notepad and replace 'YourCSVFileName.csv'
with your CSV file name.
|
[YourCSVFileName.csv]
ColNameHeader=True
Format=CSVDelimited
DateTimeFormat=dd-MMM-yyyy
Col1=A DateTime
Col2=B Text Width 100
Col3=C Text Width 100
Col4=D Long
Col5=E Double |
|
3. Save the Notepad file as 'schema.ini' in the same location of your CSV file.
4. In your web page, place a FileUpload control, an Upload button and a GridView
control. In the Upload button's click event write the following code.
|
if (filUpload.HasFile)
{
FileInfo fileinfo = new FileInfo(filUpload.PostedFile.FileName);
string strCsvFilePath = Server.MapPath("MyCSVFolder") + "\\" + fileinfo.Name;
//Save the CSV file in the Server inside 'MyCSVFolder'
filUpload.SaveAs(strCsvFilePath);
//Fetch the location of CSV file
string strFilePath = Server.MapPath("MyCSVFolder") + "\\";
string strSql = "SELECT * FROM [" + fileinfo.Name + "]";
string strCSVConnString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
+ strFilePath + ";" + "Extended Properties='text;HDR=YES;'";
// load the data from CSV to DataTable
OleDbDataAdapter oleda = new OleDbDataAdapter(strSql, strCSVConnString);
DataTable dtbCSV = new DataTable();
oleda.Fill(dtbCSV);
//Display data in a GridView control
GridView1.DataSource = dtbCSV;
GridView1.DataBind();
} |
|
5. Now you can insert the data in the DataTable into the database in your own way.
Import a CSV file every time with DIFFERENT CSV file name or
Dynamically creating Schema.ini File
If you want to upload a CSV file with different CSV file name every time, then you
have to create the schema.ini file dynamically using the FileStream and StreamWriter
object before you import the CSV file. To know how, follow the simple steps below.
In your web page, place a FileUpload control, an Upload button and a GridView control.
In the Upload button's click event write the following code.
|
if (filUpload.HasFile)
{
FileInfo fileinfo = new FileInfo(filUpload.PostedFile.FileName);
string strCsvFilePath = Server.MapPath("MyCSVFolder") + "\\" + fileinfo.Name;
filUpload.SaveAs(strCsvFilePath);
string strFilePath = Server.MapPath("MyCSVFolder");
using (FileStream filestr = new FileStream(strFilePath + "\\schema.ini",
FileMode.Create, FileAccess.Write))
{
using (StreamWriter writer=new StreamWriter(filestr))
{
writer.WriteLine("[" + fileinfo.Name + "]");
writer.WriteLine("ColNameHeader=True");
writer.WriteLine("Format=CSVDelimited");
writer.WriteLine("DateTimeFormat=dd-MMM-yy");
writer.WriteLine("Col1=A DateTime");
writer.WriteLine("Col2=B Text Width 100");
writer.WriteLine("Col3=C Text Width 100");
writer.WriteLine("Col4=E Long");
writer.WriteLine("Col5=F Double");
writer.Close();
writer.Dispose();
}
filestr.Close();
filestr.Dispose();
}
string strSql = "SELECT * FROM [" + fileinfo.Name + "]";
string strCSVConnString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
+ strFilePath + ";"
+ "Extended Properties='text;HDR=YES;'";
OleDbDataAdapter oleda = new OleDbDataAdapter(strSql, strCSVConnString);
DataTable dtbBankStmt = new DataTable();
oleda.Fill(dtbBankStmt);
GridView1.DataSource = dtbBankStmt;
GridView1.DataBind();
} |
|
|
|
The above code uses FileInfo object to get the file name of the CSV file to write
in the schema.ini file. Using the FileUpload control, we save the CSV file in the
server’s directory “MyCSVFolder”. The schema.ini file has also to be created in
this same directory. By using the FileStream object, we create the schema.ini file,
and then by using StreamWriter object, we write the content of the schema.ini file.
Rest everything is simple to read the CSV file, with the Microsoft.Jet.OLEDB provider,
load the content into a DataTable, then data manipulation can be done. For this
article purpose, we displayed the data in a GridView control.
|
|
|
|
Performance tips to import data is, pass the CSV file’s work sheet
name between square brackets (“[]”) in the SELECT statement. This will avoid the
problem if the CSV file’s work sheet name has two or more words separated by spaces.
|
|
Understanding the Schema.ini file line by line
Line 1: Name of the CSV file enclosed between square brackets
(‘[‘ and ‘]’ ).
Line 2: Specify the CSV file contains column header in
first row. If column header exists, then specify ColNameHeader as True, otherwise
False.
Line 3: Specify the CSV file delimited format. It can
be CSVDelimited, TabDelimited, Delimited(), FixedLength.
Line 4: If your CSV file contains any date column, then
specify the format of the Date value. Furthermore, if the CSV file contains any
other columns that has to be formatted properly before data manipulation, then it
can be specified here.
Line 5 and above: Specify each column’s Name, Data type,
Width if applicable. The general syntax is
|
|
|
|
Col(n)=<column name> <data type> <Width width> |
|
Where n is the position of the column in the CSV file, and Width is mandatory only
for Text.
The main concept of this article is to provide the knowledge of reading, importing
or exporting a CSV file with the help of schema.ini file. The scope the schema.ini
file can be extendable to any limit.
To learn more about Schema.ini, please refer
http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|