Sanitize with Whitelist

Any characters which are not part of an approved list can be removed, encoded or replaced.

 
The most common web application security weakness is the failure to properly validate input from the client or environment. This weakness leads to almost all of the major vulnerabilities in applications, such as interpreter injection, locale/Unicode attacks, file system attacks and buffer overflows. Data from the client should never be trusted, because the client has every opportunity to tamper with it.
In many cases, encoding can defuse attacks that rely on a lack of input validation.
For example, if you apply HTML entity encoding to user input before it is sent to a browser, it will prevent most XSS attacks. However, simply preventing attacks is not enough: you must also perform intrusion detection in your applications. Otherwise, you are allowing attackers to probe your application repeatedly until they find a vulnerability that you haven't protected against. Detecting attempts to find these weaknesses is a critical protection mechanism.
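
As a minimal sketch of the HTML entity encoding mentioned above, the following replaces the five characters that are significant in HTML with their entities. The names HtmlEncoder and encodeForHtml are illustrative assumptions, not part of any library; a production application would normally rely on a maintained encoding library rather than a hand-written routine like this one.

```java
// Hypothetical helper; the class and method names are illustrative only.
final class HtmlEncoder {

    // Encodes the characters that are significant in HTML text and attribute values.
    static String encodeForHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Encoding of this kind belongs at the point of output, and the right encoding depends on the context the data is written into (HTML body, attribute, URL, script).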

 

Accept known good

This strategy is also known as "whitelist" or "positive" validation. The idea is that you should check that the data is one of a set of tightly constrained known good values. Any data that doesn't match should be rejected. Data should be:

  • Strongly typed at all times
  • Length checked, with field lengths minimized
  • Range checked if numeric
  • Unsigned unless required to be signed
  • Checked for syntax or grammar prior to first use or inspection
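
The checks in the list above can be sketched for a single numeric field. The field (a "quantity"), its limits, and the names QuantityValidator and parseQuantity are assumptions chosen for illustration.

```java
import java.util.OptionalInt;

// Hypothetical validator; the field and its limits are assumptions for illustration.
final class QuantityValidator {

    private static final int MAX_LENGTH = 3;   // field length kept to the minimum needed
    private static final int MIN_VALUE = 1;    // range check: lower bound
    private static final int MAX_VALUE = 500;  // range check: upper bound

    // Returns the parsed quantity, or an empty OptionalInt if any check fails.
    static OptionalInt parseQuantity(String taintQuantity) {
        if (taintQuantity == null || taintQuantity.isEmpty()
                || taintQuantity.length() > MAX_LENGTH) {
            return OptionalInt.empty();          // length check
        }
        for (int i = 0; i < taintQuantity.length(); i++) {
            char c = taintQuantity.charAt(i);
            if (c < '0' || c > '9') {
                return OptionalInt.empty();      // syntax check: ASCII digits only, so no sign
            }
        }
        int value = Integer.parseInt(taintQuantity); // strongly typed from here on
        if (value < MIN_VALUE || value > MAX_VALUE) {
            return OptionalInt.empty();          // range check
        }
        return OptionalInt.of(value);
    }
}
```

Returning an empty OptionalInt forces the caller to handle rejection explicitly instead of silently continuing with a default value.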

If you expect a postcode, validate for a postcode (type, length and syntax):

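import java.util.regex.Pattern;

// Returns true only if the value is a well-formed postcode (type, length and syntax).
public boolean isPostcode(String postcode) {
    return postcode != null && Pattern.matches("^(((2|8|9)\\d{2})|((02|08|09)\\d{2})|([1-9]\\d{3}))$", postcode);
}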


Coding guidelines should require some form of visible tainting on input from the client or from untrusted sources such as third-party connectors, to make it obvious that the input is unsafe:

 
String taintPostcode = request.getParameter("postcode");
ValidationEngine validator = new ValidationEngine();
boolean isValidPostcode = validator.isPostcode(taintPostcode);
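
The tainting convention can be combined with the intrusion detection recommended earlier: a tainted value is only copied to an untainted variable after it passes validation, and failed attempts are counted per client. This is a rough sketch only; PostcodeGate, its methods, the client identifier, and the threshold of 5 are all assumptions, and it is not the ValidationEngine referred to above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

// Hypothetical sketch: names and the alert threshold are assumptions.
final class PostcodeGate {

    private static final Pattern POSTCODE =
            Pattern.compile("^(((2|8|9)\\d{2})|((02|08|09)\\d{2})|([1-9]\\d{3}))$");
    private static final int ALERT_THRESHOLD = 5;
    private static final Map<String, Integer> failuresByClient = new ConcurrentHashMap<>();

    // Returns the validated postcode, or null when the tainted value is rejected.
    static String validate(String clientId, String taintPostcode) {
        if (taintPostcode != null && POSTCODE.matcher(taintPostcode).matches()) {
            return taintPostcode; // safe to assign to an untainted variable
        }
        failuresByClient.merge(clientId, 1, Integer::sum); // record the failed attempt
        return null;
    }

    // True once a client has failed validation often enough to warrant investigation.
    static boolean looksLikeProbing(String clientId) {
        return failuresByClient.getOrDefault(clientId, 0) >= ALERT_THRESHOLD;
    }
}
```

A caller would write something like String postcode = PostcodeGate.validate(clientId, taintPostcode); so that the "taint" prefix disappears only on the validated copy.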


Reject known bad

This strategy, also known as "negative" or "blacklist" validation, is a weak alternative to positive validation. Essentially, if you don't expect to see strings such as %3f or JavaScript, you reject any input containing them. This is a dangerous strategy, because the set of possible bad data is potentially infinite. Adopting this strategy means that you will have to maintain the list of "known bad" characters and patterns forever, and you will by definition have incomplete protection.

 
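// Requires java.util.regex.Matcher and java.util.regex.Pattern.
public String removeJavascript(String input)
{
    Pattern p = Pattern.compile("javascript", Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(input);
    // Reject the whole input if "javascript" appears anywhere in it.
    return (!m.find()) ? input : "";
}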


It can take upwards of 90 regular expressions (see the CSS Cheat Sheet in the Development Guide 2.0) to eliminate known malicious software, and each regex needs to be run over every field. This is both slow and insecure. Just rejecting "current known bad" (which at the time of writing is hundreds of strings and literally millions of combinations) is insufficient if the input is a string. This strategy is directly akin to anti-virus pattern updates. Unless the business will allow the "bad" regexes to be updated daily and will support someone to research new attacks regularly, this approach will become ineffective before long.
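
To illustrate why a blacklist is incomplete by definition, the sketch below runs the same inputs through a "javascript" blacklist and through a whitelist for a hypothetical display-name field. The class name, method names, and the whitelist pattern are assumptions; the bypass strings are simple variants of the kind historically collected in XSS cheat sheets.

```java
import java.util.regex.Pattern;

// Illustrative comparison; class and method names are assumptions.
final class FilterComparison {

    private static final Pattern BLACKLIST =
            Pattern.compile("javascript", Pattern.CASE_INSENSITIVE);
    // Whitelist for a hypothetical "display name" field: letters, digits, spaces, 1 to 30 chars.
    private static final Pattern WHITELIST = Pattern.compile("^[A-Za-z0-9 ]{1,30}$");

    // Negative validation: passes anything that does not contain the known bad string.
    static boolean passesBlacklist(String input) {
        return input != null && !BLACKLIST.matcher(input).find();
    }

    // Positive validation: passes only input made entirely of approved characters.
    static boolean passesWhitelist(String input) {
        return input != null && WHITELIST.matcher(input).matches();
    }
}
```

Here passesBlacklist rejects "JavaScript:alert(1)" but accepts "java\tscript:alert(1)" (an embedded tab) and "&#106;avascript:alert(1)" (an entity-encoded first letter), whereas passesWhitelist rejects all three without needing to know about any of them.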


Sanitize


// Replaces each listed special character in every input string with "’".
public static String[] sanitizedData ( String... input )
{
    String sanitizedData[] = new String[input.length];
    int index = 0;

    for ( String i : input )
    {
        sanitizedData[index++] = i.replaceAll ( "[\'~!@#$%^&*()\";: <>?/,.]", "’" );
    }
    return sanitizedData;
}


// Encodes angle brackets as HTML entities instead of removing them.
public static String[] sanitizedDatawithQuote ( String... input )
{
    String sanitizedData[] = new String[input.length];
    int index = 0;

    for ( String i : input )
    {
        i = i.replaceAll ( "<", "&lt;" );
        i = i.replaceAll ( ">", "&gt;" );
        //etc..
        sanitizedData[index++] = i; // advance the index so each input keeps its own slot
    }
    return sanitizedData;
}



//calling example


/*
String sanitizedData[] = SecureAlgorithm.sanitizedData ( "vijay\'~ !@#$%^&*()\";:<>?/,.", "jessy%" );
String sanitizedDatawithQuote[] = SecureAlgorithm.sanitizedDatawithQuote ( "vijay<>" );
System.out.println ( sanitizedData[0] + "\n" + sanitizedData[1] );
System.out.println ( sanitizedDatawithQuote[0] );

*/
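
The sanitizers above enumerate the characters to replace. A whitelist variant, in keeping with the idea that any character not on an approved list is removed, can be sketched as follows. WhitelistSanitizer and the approved set (ASCII letters, digits, space, hyphen) are assumptions; the right set depends on the field being sanitized.

```java
// Hypothetical whitelist sanitizer; the approved character set is an assumption.
final class WhitelistSanitizer {

    // Removes every character that is not an ASCII letter, digit, space, or hyphen.
    static String[] sanitize(String... input) {
        String[] sanitized = new String[input.length];
        for (int i = 0; i < input.length; i++) {
            sanitized[i] = input[i].replaceAll("[^A-Za-z0-9 -]", "");
        }
        return sanitized;
    }
}
```

Because the pattern names what is allowed rather than what is forbidden, a character nobody thought to list is dropped by default instead of slipping through.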