Showing posts with label java. Show all posts

Tuesday 16 July 2013

How To Extract HTML Links With Regular Expression

// siddhu vydyabhushana // 6 comments
In this tutorial, we will show you how to extract a hyperlink from an HTML page. For example, to get the link from the following content:
this is text1 <a href='mkyong.com' target='_blank'>hello</a> this is text2...
  1. First get the “value” from the a tag – Result : a href='mkyong.com' target='_blank'
  2. Then get the “link” from the value extracted above – Result : mkyong.com

1. Regular Expression Pattern

Extract A tag Regular Expression Pattern
(?i)<a([^>]+)>(.+?)</a>
Extract Link From A tag Regular Expression Pattern
\s*(?i)href\s*=\s*("([^"]*")|'[^']*'|([^'">\s]+))
Description
(?i)		#inline flag: all matching is case-insensitive (not a capturing group)
<a              #start with "<a"
  (		#  start of group #1
    [^>]+	#     anything except (">"), at least one character
   )		#  end of group #1
  >		#     followed by ">"
    (.+?)	#	group #2: match anything, as few characters as possible
         </a>	#	  end with "</a>"
\s*			   #can start with whitespace
  (?i)			   # all matching is case-insensitive
     href		   #  followed by the word "href"
        \s*=\s*		   #   allows spaces on either side of the equal sign
              (		   #    start of group #1
               "([^"]*")   #      allow a string enclosed in double quotes - "string"
               |	   #	  ..or
               '[^']*'	   #        allow a string enclosed in single quotes - 'string'
               |           #	  ..or
               ([^'">\s]+) #      unquoted value: no single quote, double quote, ">" or whitespace
	      )		   #    end of group #1
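To see the group numbering in action, here is a minimal sketch that runs both patterns against the sample line from the introduction. The class name LinkPatternDemo and the method firstLink are made up for this illustration; only the two patterns come from the tutorial.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the two patterns are taken from the tutorial.
class LinkPatternDemo {

    static final Pattern A_TAG = Pattern.compile("(?i)<a([^>]+)>(.+?)</a>");
    static final Pattern HREF = Pattern.compile(
            "\\s*(?i)href\\s*=\\s*(\"([^\"]*\")|'[^']*'|([^'\">\\s]+))");

    // Returns the href value of the first a tag in html, or null if none is found.
    static String firstLink(String html) {
        Matcher tag = A_TAG.matcher(html);
        if (!tag.find()) {
            return null;
        }
        // Group 1 of the tag pattern holds everything between "<a" and ">".
        Matcher href = HREF.matcher(tag.group(1));
        if (!href.find()) {
            return null;
        }
        // Group 1 of the href pattern is the raw value, possibly still quoted.
        return href.group(1).replace("'", "").replace("\"", "");
    }

    public static void main(String[] args) {
        String html = "this is text1 <a href='mkyong.com' target='_blank'>hello</a> this is text2...";
        System.out.println(firstLink(html)); // prints mkyong.com
    }
}
```

Note that the inline flag (?i) does not create a group, so the first pair of parentheses after "<a" is group 1.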

2. Java Link Extractor Example

Here’s a simple Java link extractor example. It extracts the a tag value with the 1st pattern, then uses the 2nd pattern to extract the link from that value.
HTMLLinkExtractor.java
package com.mkyong.crawler.core;
 
import java.util.Vector;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
 
public class HTMLLinkExtractor {
 
	private Pattern patternTag, patternLink;
	private Matcher matcherTag, matcherLink;
 
	private static final String HTML_A_TAG_PATTERN = "(?i)<a([^>]+)>(.+?)</a>";
	private static final String HTML_A_HREF_TAG_PATTERN = 
		"\\s*(?i)href\\s*=\\s*(\"([^\"]*\")|'[^']*'|([^'\">\\s]+))";
 
 
	public HTMLLinkExtractor() {
		patternTag = Pattern.compile(HTML_A_TAG_PATTERN);
		patternLink = Pattern.compile(HTML_A_HREF_TAG_PATTERN);
	}
 
	/**
	 * Extract links from html with regular expressions
	 * 
	 * @param html
	 *            html content to scan for a tags
	 * @return Vector of links and their link text
	 */
	public Vector<HtmlLink> grabHTMLLinks(final String html) {
 
		Vector<HtmlLink> result = new Vector<HtmlLink>();
 
		matcherTag = patternTag.matcher(html);
 
		while (matcherTag.find()) {
 
			String href = matcherTag.group(1); // attributes of the a tag, including href
			String linkText = matcherTag.group(2); // link text
 
			matcherLink = patternLink.matcher(href);
 
			while (matcherLink.find()) {
 
				String link = matcherLink.group(1); // link
				HtmlLink obj = new HtmlLink();
				obj.setLink(link);
				obj.setLinkText(linkText);
 
				result.add(obj);
 
			}
 
		}
 
		return result;
 
	}
 
	class HtmlLink {
 
		String link;
		String linkText;
 
		HtmlLink() {}
 
		@Override
		public String toString() {
			return new StringBuffer("Link : ").append(this.link)
			.append(" Link Text : ").append(this.linkText).toString();
		}
 
		public String getLink() {
			return link;
		}
 
		public void setLink(String link) {
			this.link = replaceInvalidChar(link);
		}
 
		public String getLinkText() {
			return linkText;
		}
 
		public void setLinkText(String linkText) {
			this.linkText = linkText;
		}
 
		private String replaceInvalidChar(String link){
			link = link.replaceAll("'", "");
			link = link.replaceAll("\"", "");
			return link;
		}
 
	}
}

3. Unit Test

Unit test with TestNG. Simulate the HTML content via @DataProvider.
TestHTMLLinkExtractor.java
package com.mkyong.crawler.core;
 
import java.util.Vector;
 
import org.testng.Assert;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;
 
import com.mkyong.crawler.core.HTMLLinkExtractor.HtmlLink;
 
/**
 * HTML link extractor testing
 * 
 * @author mkyong
 * 
 */
public class TestHTMLLinkExtractor {
 
	private HTMLLinkExtractor htmlLinkExtractor;
	String TEST_LINK = "http://www.google.com";
 
	@BeforeClass
	public void initData() {
		htmlLinkExtractor = new HTMLLinkExtractor();
	}
 
	@DataProvider
	public Object[][] HTMLContentProvider() {
	  return new Object[][] {
	    new Object[] { "abc hahaha <a href='" + TEST_LINK + "'>google</a>" },
	    new Object[] { "abc hahaha <a HREF='" + TEST_LINK + "'>google</a>" },
 
	    new Object[] { "abc hahaha <A HREF='" + TEST_LINK + "'>google</A> , "
		+ "abc hahaha <A HREF='" + TEST_LINK + "' target='_blank'>google</A>" },
 
	    new Object[] { "abc hahaha <A HREF='" + TEST_LINK + "' target='_blank'>google</A>" },
	    new Object[] { "abc hahaha <A target='_blank' HREF='" + TEST_LINK + "'>google</A>" },
	    new Object[] { "abc hahaha <A target='_blank' HREF=\"" + TEST_LINK + "\">google</A>" },
	    new Object[] { "abc hahaha <a HREF=" + TEST_LINK + ">google</a>" }, };
	}
 
	@Test(dataProvider = "HTMLContentProvider")
	public void ValidHTMLLinkTest(String html) {
 
		Vector<HtmlLink> links = htmlLinkExtractor.grabHTMLLinks(html);
 
		//at least one link must be found
		Assert.assertTrue(links.size() != 0);
 
		for (int i = 0; i < links.size(); i++) {
			HtmlLink htmlLinks = links.get(i);
			//System.out.println(htmlLinks);
			Assert.assertEquals(htmlLinks.getLink(), TEST_LINK);
		}
 
	}
}
Result
[TestNG] Running:
  /private/var/folders/w8/jxyz5pf51lz7nmqm_hv5z5br0000gn/T/testng-eclipse--530204890/testng-customsuite.xml
 
PASSED: ValidHTMLLinkTest("abc hahaha <a href='http://www.google.com'>google</a>")
PASSED: ValidHTMLLinkTest("abc hahaha <a HREF='http://www.google.com'>google</a>")
PASSED: ValidHTMLLinkTest("abc hahaha <A HREF='http://www.google.com'>google</A> , abc hahaha <A HREF='http://www.google.com' target='_blank'>google</A>")
PASSED: ValidHTMLLinkTest("abc hahaha <A HREF='http://www.google.com' target='_blank'>google</A>")
PASSED: ValidHTMLLinkTest("abc hahaha <A target='_blank' HREF='http://www.google.com'>google</A>")
PASSED: ValidHTMLLinkTest("abc hahaha <A target='_blank' HREF="http://www.google.com">google</A>")
PASSED: ValidHTMLLinkTest("abc hahaha <a HREF=http://www.google.com>google</a>")

Convert String With Commas To Long – Java

// siddhu vydyabhushana // 3 comments
A short guide to show you how to convert a String with commas to a long type.
1. For a normal String, you can use Long.valueOf to convert it directly.
	String bigNumber = "1234567899";
	long result = Long.valueOf(bigNumber);
2. For a String with commas, you can use java.text.NumberFormat to convert it.
	String bigNumber = "1,234,567,899";
	NumberFormat format = NumberFormat.getInstance(Locale.US);
	Number number = 0;
	try {
		number = format.parse(bigNumber);
	} catch (ParseException e) {
		e.printStackTrace();
	}
	long result = number.longValue();
3. Alternatively, if you don’t care about Locale, just replace all the commas.
	String bigNumber = "1,234,567,899";
	long result3 = Long.valueOf(bigNumber.replace(",", ""));
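The three approaches can be put side by side in one small class. This is only a sketch: the class and method names are invented here, and it uses Long.parseLong (which returns a primitive long directly) in place of Long.valueOf.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper class collecting the three conversions described above.
class CommaNumberDemo {

    // 1. Plain digits only.
    static long parsePlain(String s) {
        return Long.parseLong(s);
    }

    // 2. Locale-aware parsing; Locale.US treats "," as the grouping separator.
    static long parseWithFormat(String s) throws ParseException {
        return NumberFormat.getInstance(Locale.US).parse(s).longValue();
    }

    // 3. Locale-agnostic: drop every comma, then parse.
    static long parseByStripping(String s) {
        return Long.parseLong(s.replace(",", ""));
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(parsePlain("1234567899"));          // prints 1234567899
        System.out.println(parseWithFormat("1,234,567,899"));  // prints 1234567899
        System.out.println(parseByStripping("1,234,567,899")); // prints 1234567899
    }
}
```

One difference worth knowing: NumberFormat.parse stops at the first character it cannot parse instead of failing, so "12abc" yields 12, while Long.parseLong throws NumberFormatException for the same input.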

Tuesday 4 December 2012

JComboBox in Swings

// siddhu vydyabhushana // 2 comments



Hi, this is siddhu vydyabhushana, and I am going to explain JComboBox in Swing. It is a Swing component that lets you select one item from a group, but it is quite different from JRadioButton: it gives you a drop-down list. We take the items as strings and put them into the JComboBox.

EventHandling
Any change in the state of an object is known as an event.
State changes are handled by the listener below:




JComboBox j;
j.addItemListener(new ItemListener()
{
public void itemStateChanged(ItemEvent e)
{
//your handling code here
}
});
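As a rough sketch of what the listener receives, the class below changes the selection from code and records every newly selected item. The class and method names are hypothetical. A JComboBox fires two item events per change (DESELECTED for the old item, SELECTED for the new one), so the listener filters on getStateChange(). No window is shown, and for brevity the calls are not made on the event dispatch thread.

```java
import java.awt.event.ItemEvent;
import java.util.ArrayList;
import java.util.List;
import javax.swing.JComboBox;

// Hypothetical demo: records the items reported as SELECTED by an ItemListener.
class ComboItemListenerDemo {

    static List<String> recordSelections(String[] items, String... picks) {
        JComboBox<String> combo = new JComboBox<>(items);
        List<String> selected = new ArrayList<>();

        // ItemListener has a single method, itemStateChanged, so a lambda works.
        combo.addItemListener(e -> {
            if (e.getStateChange() == ItemEvent.SELECTED) {
                selected.add((String) e.getItem());
            }
        });

        // Simulate the user choosing entries from the drop-down.
        for (String pick : picks) {
            combo.setSelectedItem(pick);
        }
        return selected;
    }

    public static void main(String[] args) {
        String[] items = {"javatyro", "siddhu", "vydyas"};
        System.out.println(recordSelections(items, "siddhu", "vydyas"));
    }
}
```

Selecting the item that is already selected fires no item event, which is why the initial entry never shows up in the recorded list.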



EXAMPLE:

import javax.swing.*;
import java.awt.*;
import java.awt.event.*;

//class name combo
public class combo
{

    //constructor combo
    public combo()
    {

        //c1 is accessed from the anonymous inner class below, so it is declared final
        final JComboBox<String> c1;

        //frame
        JFrame j=new JFrame();

        //JButton
        final JButton b=new JButton("what you select?");
        String s[]={"javatyro","siddhu","vydyas"};

        c1=new JComboBox<String>(s);

        c1.setBounds(10,10,150,30);
        b.setBounds(10,50,150,30);
        j.setSize(200,200);
        j.setLayout(null);

        j.add(b);
        j.add(c1);
        j.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

        //Event Handling

        b.addActionListener(new ActionListener()
        {
            public void actionPerformed(ActionEvent e)
            {
                String str=(String)c1.getSelectedItem();
                JOptionPane.showMessageDialog(null,"You selected : "+str);
            }
        });

        //show the frame only after all components have been added
        j.setVisible(true);
    }

    public static void main(String args[])
    {
        new combo();
    }
}

OUTPUT SCREENS

[Screenshot: jcombobox in swings]

Saturday 1 December 2012

Insert Data into Database using Swings,Applets part-2

// siddhu vydyabhushana // 9 comments
Good evening guys. I have just described all the Java functions in part 1. In part 2 I am going to give you the front-end code and an explanation.
Follow the steps below.

 PART-1

http://javatyro.blogspot.in/2012/12/insert-data-into-database-using.html





PACKAGES


import java.sql.*;
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;

JTextField component in Swing, used for input fields

JTextField jt1=new JTextField(20);
JTextField jt2=new JTextField(20);
JTextField jt3=new JTextField(20);
JTextField jt4=new JTextField(20);

JLabel

JLabel jlb1=new JLabel("Username:");
JLabel jlb2=new JLabel("password:");
JLabel jlb3=new JLabel("confirm password:");
JLabel jlb4=new JLabel("email:");

JRadioButton



JRadioButton m, f;
ButtonGroup radio=new ButtonGroup();

m=new JRadioButton("male");
radio.add(m);//adding the button to the ButtonGroup

f=new JRadioButton("female");
radio.add(f);//adding the button to the ButtonGroup

JButton


JButton button=new JButton("Submit");

Adjust  OBJECTS positions in JFrame

name.setBounds(int x,int y,int width,int height);

jlb1.setBounds(40,50,70,30);//adjust label to co-ordinates
jlb2.setBounds(40,90,70,30);//adjust label to co-ordinates
jlb3.setBounds(40,130,70,30);//adjust label to co-ordinates
jlb4.setBounds(40,170,70,30);//adjust label to co-ordinates

jt1.setBounds(130,50,100,30);//adjust JText Field to co-ordinates
jt2.setBounds(130,90,100,30);//adjust JText Field to co-ordinates
jt3.setBounds(130,130,100,30);//adjust JText Field to co-ordinates
jt4.setBounds(130,170,100,30);//adjust JText Field to co-ordinates

m.setBounds(40,210,100,30);//adjust JRadioButton to co-ordinates
f.setBounds(140,210,100,30);//adjust JRadioButton to co-ordinates

button.setBounds(80,250,100,30);//adjust JButton to co-ordinates

ADD OBJECTS TO JFrame

j.add(img1);
j.add(jlb1);
j.add(jt1);
j.add(jlb2);
j.add(jt2);
j.add(jlb3);
j.add(jt3);
j.add(jlb4);
j.add(jt4);
j.add(m);
j.add(f);
j.add(button);
j.add(text);

EVENT HANDLING

button.addActionListener(new ActionListener()
{
public void actionPerformed(ActionEvent e)
{
//data here
}
});

VALIDATIONS

String name=jt1.getText();
String pass=jt2.getText();
String cpass=jt3.getText();
String email=jt4.getText();
String g=text.getText();

if((name.length()<6)||(name.length()>=15))
{
    JOptionPane.showMessageDialog(null,"name must be at least 6 and fewer than 15 characters");
    jt1.setText("");
}
else if(!pass.equals(cpass))
{
    JOptionPane.showMessageDialog(null,"passwords do not match");
}
else if(email.length()<10)
{
    JOptionPane.showMessageDialog(null,"enter your correct id");
}
else if(!m.isSelected()&&!f.isSelected())
{
    JOptionPane.showMessageDialog(null,"Please select radio button");
}
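The same checks can also be kept apart from the Swing code, which makes them easier to try out on their own. The sketch below is one possible arrangement, not the project's actual code: the class SignupValidator, the method firstError and the exact message wording are assumptions made for this example, while the rules (name length 6 to 14, matching passwords, email of at least 10 characters, one radio button selected) follow the checks above.

```java
import java.util.Optional;

// Hypothetical helper: returns the first validation error, or empty if all checks pass.
class SignupValidator {

    static Optional<String> firstError(String name, String pass, String cpass,
                                       String email, boolean genderSelected) {
        if (name.length() < 6 || name.length() >= 15) {
            return Optional.of("name must be at least 6 and fewer than 15 characters");
        }
        if (!pass.equals(cpass)) {
            return Optional.of("passwords do not match");
        }
        if (email.length() < 10) {
            return Optional.of("enter your correct id");
        }
        if (!genderSelected) {
            return Optional.of("Please select radio button");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A short name is reported first, before the other fields are checked.
        System.out.println(firstError("sid", "secret", "secret", "sid@example.com", true));
        // All checks pass, so this prints Optional.empty
        System.out.println(firstError("siddhu", "secret", "secret", "sid@example.com", true));
    }
}
```

Inside actionPerformed, the result could be passed to JOptionPane.showMessageDialog when it is present, with m.isSelected() || f.isSelected() supplied as the last argument.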

HOW TO POP UP DIALOG


JOptionPane.showMessageDialog(null,"Please select radio button");

Monday 26 November 2012

Types of JDBC Drivers

// siddhu vydyabhushana // Leave a Comment
My first post on JDBC is

JDBC TUTORIAL PART-1

There are four types of drivers in JDBC. I will explain what the four drivers are used for and which type of driver we are going to use.


1)JDBC-ODBC bridge driver

    The JDBC-ODBC bridge driver connects to the database by converting JDBC method calls into ODBC function calls.

2)NATIVE-API DRIVER

3)NETWORK PROTOCOL DRIVER


4)THIN DRIVER






Sunday 25 November 2012

java database connectivity tutorial

// siddhu vydyabhushana // 2 comments
JDBC

JDBC is a Java API that is used to connect to a database and execute queries against it. The JDBC API uses JDBC drivers to connect to the database.
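As a minimal sketch of what "connect and execute a query" looks like with the standard java.sql classes: the class name, the JDBC URL and the SQL text below are placeholders invented for this example, and a real run needs the matching driver on the classpath. Without a suitable driver, DriverManager.getConnection throws an SQLException.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; URL and query are placeholders, not a real database.
class JdbcQueryDemo {

    // Runs the query and returns the first column of every row as text.
    static List<String> firstColumn(String jdbcUrl, String sql) throws SQLException {
        List<String> values = new ArrayList<>();
        // try-with-resources closes the result set, statement and connection in reverse order.
        try (Connection con = DriverManager.getConnection(jdbcUrl);
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                values.add(rs.getString(1));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        try {
            // Placeholder URL: replace it with the URL format of your own driver.
            System.out.println(firstColumn("jdbc:hypothetical://localhost/demo",
                    "SELECT name FROM users"));
        } catch (SQLException e) {
            System.out.println("Query failed: " + e.getMessage());
        }
    }
}
```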

why use JDBC?

Before JDBC, the ODBC API was used to connect to the database and execute queries. But the ODBC API uses an ODBC driver that is implemented in C, which is platform dependent and insecure. So Sun Microsystems defined its own API, JDBC.

WHAT IS API?

An Application Programming Interface is a document that describes all the features of a piece of software. It represents the classes and interfaces that software programs can follow to communicate. An API can be created for applications, libraries, operating systems, etc.


