In this article I’m going to explain how to import records from HTML and store to database.

      I got one requirement that I’ve to import records from simple HTML table and store those records to the database. It’s not complicated one but we should know about that whether that HTML page is well formatted.

          This is sample HTML table. It looks like normal table but it contains style properties

CustomerID

CustomerName

ContactName

Address

City

PostalCode

Country

1

Alfreds Futterkiste

Maria Anders

Obere Str. 57

Berlin

12209

Germany

2

Ana Trujillo Emparedados

Ana Trujillo

Avda. de la Constitución 2222

México D.F.

05021

Mexico

3

Antonio Moreno Taquería

Antonio Moreno

Mataderos 2312

México D.F.

05023

Mexico

4

Around the Horn

Thomas Hardy

120 Hanover Sq.

London

WA1 1DP

UK

5

Berglunds snabbköp

Christina Berglund

Berguvsvägen 8

Luleå

S-958 22

Sweden

 

Sample source code for HTML table:

<tr style="background-color: rgb(255, 255, 255);">

            <td style="line-height: 16px; font-family: verdana, helvetica, arial, sans-serif; font-size: 12px;">

                1<br />

                <br />

            </td>

            <td style="line-height: 16px; font-family: verdana, helvetica, arial, sans-serif; font-size: 12px">

                Alfreds Futterkiste</td>

            <td style="line-height: 16px; font-family: verdana, helvetica, arial, sans-serif; font-size: 12px">

                Maria Anders</td>

            <td style="line-height: 16px; font-family: verdana, helvetica, arial, sans-serif; font-size: 12px">

                Obere Str. 57</td>

            <td style="line-height: 16px; font-family: verdana, helvetica, arial, sans-serif; font-size: 12px">

                Berlin</td>

            <td style="line-height: 16px; font-family: verdana, helvetica, arial, sans-serif; font-size: 12px">

                12209</td>

            <td style="line-height: 16px; font-family: verdana, helvetica, arial, sans-serif; font-size: 12px">

                Germany</td>

        </tr>

 

Here we need to make sure that how we can get content from simple HTML table.  So only possibility is we have to download HTML string and extract <table> content. <table> tag contains <tr> and <td> tags but it may have style properties. So first we have to avoid these style properties and after we can get required content from table. Please follow the steps I’ve written here.

          First we have to upload HTML page to server folder. Then we can download HTML string from that HTML page.

Sample code:

          WebClient wc = new WebClient();

string fileContent = wc.DownloadString(url);

           

          Here we have to format HTML tags for avoid style properties

Sample code:

         const string msgFormat = "table[{0}], tr[{1}], td[{2}], a: {3}, b: {4}";   

       const string table_pattern = "<table.*?>(.*?)</table>"; 

       const string tr_pattern = "<tr.*?>(.*?)</tr>"; 

       const string td_pattern = "<td.*?>(.*?)</td>"; 

       const string a_pattern = "<a href=\"(.*?)\"></a>"; 

       const string b_pattern = "<b>(.*?)</b>";

 

          After through looping we can find <tr> and <td> elements. Then we can get content within <td></td> tags by using this method.

    private static List<string> GetContents(string input, string pattern)

    { 

        MatchCollection matches = Regex.Matches(input, pattern, RegexOptions.Singleline); 

        List<string> contents = new List<string>(); 

        foreach (Match match in matches) 

        contents.Add(match.Value); 

        return contents; 

    }

           Then we can insert imported records to database by each row. 

Designer source code:

<%@ Page Language="C#" AutoEventWireup="true" CodeFile="Default.aspx.cs" Inherits="_Default"

    EnableEventValidation="true" %>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

<head runat="server">

    <title></title>

</head>

<body>

    <form id="form1" runat="server">   

    <div>

        <table>        

            <tr>

                <td>

                    <asp:FileUpload ID="FileUpload1" runat="server" />

                </td>

                <td>

                    <asp:Button ID="Button1" runat="server" Text="submit" OnClick="button1_Click" />

                </td>

            </tr>

            <tr>

                <td colspan="2">

                <br />

                    <asp:GridView ID="gvCustomers" runat="server">                

                    </asp:GridView>

                </td>

            </tr>        

            </table>

    </div>

    </form>

</body>

</html>

           Here we have to use following Namespaces

 using System.Net;

using System.IO;

using System.Text.RegularExpressions; 

C# code:

using System;

using System.Collections.Generic;

using System.Linq;

using System.Web;

using System.Web.UI;

using System.Web.UI.WebControls;

using System.Web.UI.Adapters;

using System.Collections;

using System.Text.RegularExpressions;

using System.Data;

using System.Data.SqlClient;

using System.Net;

using System.IO;

 public partial class _Default : System.Web.UI.Page

{

    //please refer connection string from web config.  

    SqlConnection conn = new SqlConnection("Data Source=SPIDER;Initial Catalog=Northwind;Integrated Security=True");

 

    protected void Page_Load(object sender, EventArgs e)

    {

        if (!IsPostBack)

        {

            BindGrid();

        }

    }

     const string msgFormat = "table[{0}], tr[{1}], td[{2}], a: {3}, b: {4}";   

    const string table_pattern = "<table.*?>(.*?)</table>"; 

    const string tr_pattern = "<tr.*?>(.*?)</tr>"; 

    const string td_pattern = "<td.*?>(.*?)</td>"; 

    const string a_pattern = "<a href=\"(.*?)\"></a>"; 

    const string b_pattern = "<b>(.*?)</b>";

 

    private static List<string> GetContents(string input, string pattern)

    {

        MatchCollection matches = Regex.Matches(input, pattern, RegexOptions.Singleline);

        List<string> contents = new List<string>();

        foreach (Match match in matches)

        contents.Add(match.Value);

        return contents;

    }

   

    protected void button1_Click(object sender, EventArgs e)

    {       

        WebClient wc = new WebClient();

        string uploadfile = FileUpload1.FileName;

        string filepath = Server.MapPath("~/Document/" + uploadfile);     

        FileUpload1.SaveAs(filepath);

        // here you could get virtual path of the html file

        //for testing you could change port number as per your localhost

        string url = "http://localhost:7331/Html-DB/Document/" + uploadfile;

        string fileContent = wc.DownloadString(url);

 

        List<string> tableContents = GetContents(fileContent, table_pattern);       

        int tableIndex = 0;

        List<string> list = new List<string>();

 

        foreach (string tableContent in tableContents)

        {

            List<string> trContents = GetContents(tableContent, tr_pattern);

            int trIndex = 0;

 

            foreach (string trContent in trContents)

            {

                List<string> tdContents = GetContents(trContent, td_pattern);

                string[] columns = new string[tdContents.Count];

 

                foreach (string tdContent in tdContents)

                {

                    string result = Regex.Replace(tdContent, @"<[^>]*>", string.Empty);

                    list.Add(result);

                }

                if (list.Count > 0)

                {

                    string sql = "INSERT INTO Customers_new(CustomerID, CompanyName, ContactName, Address, City, PostalCode, Country)";

                    sql += "VALUES('" + list[0].Trim() + "','" + list[1].Trim() + "','" + list[2].Trim() + "','" + list[3].Trim() + "','" + list[4].Trim() + "','" + list[5].Trim() + "','" + list[6].Trim() + "')";

                    conn.Open();

                    SqlCommand cmd = new SqlCommand(sql, conn);

                    cmd.ExecuteNonQuery();

                    conn.Close();

                }

                list.Clear();

                trIndex++;

            }

            tableIndex++;

        }

        BindGrid();

    }

 

    private void BindGrid()

    {

        DataSet ds = new DataSet();

        conn.Open();

        string cmdstr = "Select CustomerID,CompanyName,ContactName,Address,City,PostalCode,Country from Customers_new";

        SqlCommand cmd = new SqlCommand(cmdstr, conn);

        SqlDataAdapter adp = new SqlDataAdapter(cmd);

        adp.Fill(ds);

        gvCustomers.DataSource = ds;

        gvCustomers.DataBind();    

    }

}



View demo Download
http://www.dotnetfox.com/Document/1053/0Demo1.gif

If you enjoyed this article, get email updates (it's free).


Related Articles
  • Parse or read CSV file using TextFieldParser in ASP.NET CSV file contains multiple values which are separated by commas (,). In order to parse CSV file, we need to write some code. TextFieldParser class in the Microsoft.VisualBasic.FileIO namespace is used to parse CSV files. TextFieldParser reads in CSV files. With it, we specify a delimiter string, and then can read in the fields of every line in a loop. While parsing CSV file, it doesn’t fetch heade...
  • Import records from excel sheet to SQL Server using ASP.NET and C#In this article I’m going to explain how to import records from excel sheet to SQL Server using ASP.NET and C#. I got a requirement that I should import records from excel sheet to SQL Server. To achieve that I have created demo and you could follow the steps given below.
  • Import records from excel sheet to SQL Server with validations in ASP.NET using C#In this article I’m going to explain how to import records from excel sheet to SQL Server with validations in ASP.NET using C#. I have already written an article for imports records from excel sheet to SQL Server. Import records from excel sheet to SQL Server using ASP.NET and C#. Here I’ll show you how to do validations in Excel sheet while importing records.
  • Import records from HTML table and store to database in ASP.NET using C#In this article I’m going to explain how to import records from HTML and store to database. I got one requirement that I’ve to import records from simple HTML table and store those records to the database. It’s not complicated one but we should know about that whether that HTML page is well formatted. Here we need to make sure that how we can get content from simple HTML table. So only possibilit...

Comments
comments powered by Disqus

Thanks for downloading..!!

×
Subscribe via Email
    
message to display in fancybox