terrencexu

XML validation for multiple schemas: validating an XML file that uses multiple XSD schemas


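To keep XSD files readable, maintainable, and reusable, we often need to split a schema into several files. This article focuses on validating a single XML file against multiple schema files (XML validation for multiple schemas).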

 

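The following steps demonstrate how to validate a single XML file against multiple schema (XSD) files:

1. Create the XML file to be validated

2. Generate the XSD files from the XML

3. Validate the XML file against multiple schemas

4. Run the test

 

Let's go through the steps one by one:

1. Create the XML file to be validated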

<?xml version="1.0" encoding="utf-8" ?>
<employees xmlns:admin="http://www.company.com/management/employees/admin">
	<admin:employee>
		<admin:userId>johnsmith@company.com</admin:userId>
		<admin:password>abc123_</admin:password>
		<admin:name>John Smith</admin:name>
		<admin:age>24</admin:age>
		<admin:gender>Male</admin:gender>
	</admin:employee>
	<admin:employee>
		<admin:userId>christinechen@company.com</admin:userId>
		<admin:password>123456</admin:password>
		<admin:name>Christine Chen</admin:name>
		<admin:age>27</admin:age>
		<admin:gender>Female</admin:gender>
	</admin:employee>
</employees>

 

 

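2. Generate the XSD files from the XML

Note: in this article the XSD files are generated in reverse from the XML. If you already have XSD files, you can skip step 2.

 

By inspecting the structure of employees.xml we could write employees.xsd by hand, but to save time we can use an XML-to-XSD conversion tool. Here I use trang: http://www.thaiopensource.com/relaxng/trang.html

 

First download the latest trang.jar, put employees.xml and trang.jar in the same directory, and run the following command: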

java -jar trang.jar employees.xml employees.xsd

This generates two XSD files in the current directory, employees.xsd and admin.xsd:

 

employees.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" xmlns:admin="http://www.company.com/management/employees/admin">
  <xs:import namespace="http://www.company.com/management/employees/admin" schemaLocation="admin.xsd"/>
  <xs:element name="employees">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="admin:employee"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

 

admin.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" targetNamespace="http://www.company.com/management/employees/admin" xmlns:admin="http://www.company.com/management/employees/admin">
  <xs:import schemaLocation="employees.xsd"/>
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="admin:userId"/>
        <xs:element ref="admin:password"/>
        <xs:element ref="admin:name"/>
        <xs:element ref="admin:age"/>
        <xs:element ref="admin:gender"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="userId" type="xs:string"/>
  <xs:element name="password" type="xs:NMTOKEN"/>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="age" type="xs:integer"/>
  <xs:element name="gender" type="xs:NCName"/>
</xs:schema>

 

Of course, you can also write the XSD files by hand.

 

 

3. Validate the XML file against multiple schemas

 

Validating an XML file against a single schema should not cause much trouble. For example:

public static boolean validateSingleSchema(File xml, File xsd) {
	boolean legal = false;

	try {
		SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
		Schema schema = sf.newSchema(xsd);

		Validator validator = schema.newValidator();
		validator.validate(new StreamSource(xml));

		legal = true;
	} catch (Exception e) {
		legal = false;
		log.error(e.getMessage());
	}

	return legal;
}
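The method above stops at the first problem and only logs its message. As a minimal sketch (the class name `CollectingValidator` and its string-based signature are hypothetical and not part of the original code), a custom `ErrorHandler` can be set on the `Validator` to collect every reported problem together with its line and column:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical helper for illustration only.
final class CollectingValidator {

	/** Validates xml against xsd (both given as strings) and returns every reported problem. */
	static List<String> validate(String xsd, String xml) {
		final List<String> problems = new ArrayList<String>();
		try {
			SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
			Schema schema = sf.newSchema(new StreamSource(new StringReader(xsd)));

			Validator validator = schema.newValidator();
			validator.setErrorHandler(new ErrorHandler() {
				// Recoverable problems are recorded so that validation can continue.
				public void warning(SAXParseException e) { problems.add(describe("WARNING", e)); }
				public void error(SAXParseException e) { problems.add(describe("ERROR", e)); }
				// Fatal errors (for example malformed XML) still abort validation.
				public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
			});
			validator.validate(new StreamSource(new StringReader(xml)));
		} catch (Exception e) {
			problems.add("FATAL: " + e.getMessage());
		}
		return problems;
	}

	private static String describe(String level, SAXParseException e) {
		return level + " at line " + e.getLineNumber() + ", column " + e.getColumnNumber()
				+ ": " + e.getMessage();
	}
}
```

An empty list means the document is valid; otherwise each entry describes one problem, which tends to be easier to act on than a single boolean.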

 

 

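However, when validating with multiple schemas, XSD files outside the classpath that are pulled in through <xs:import>/<xs:include> may fail to load, which produces the following error message: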

org.xml.sax.SAXParseException: src-resolve: Cannot resolve the name 'admin:employee' to a(n) 'element declaration' component.
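One common way to run into this kind of failure is to compile the main schema from a source that carries no system ID, so that a relative schemaLocation has nothing to be resolved against. Whether this is exactly what happens in your setup is an assumption; the sketch below only reproduces that scenario. The class name `UnresolvedImportDemo` is hypothetical and the two schemas are trimmed-down stand-ins for employees.xsd and admin.xsd:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;

import org.xml.sax.SAXException;

// Hypothetical demo class for illustration only.
final class UnresolvedImportDemo {

	private static final String ADMIN_NS = "http://www.company.com/management/employees/admin";

	private static final String EMPLOYEES_XSD =
			"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' xmlns:admin='" + ADMIN_NS + "'>"
			+ "<xs:import namespace='" + ADMIN_NS + "' schemaLocation='admin.xsd'/>"
			+ "<xs:element name='employees'><xs:complexType><xs:sequence>"
			+ "<xs:element maxOccurs='unbounded' ref='admin:employee'/>"
			+ "</xs:sequence></xs:complexType></xs:element></xs:schema>";

	private static final String ADMIN_XSD =
			"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' targetNamespace='" + ADMIN_NS + "'>"
			+ "<xs:element name='employee' type='xs:string'/></xs:schema>";

	/** Returns true if compiling the main schema from a bare stream fails with a SAXException. */
	static boolean failsWithoutSystemId() {
		try {
			File dir = Files.createTempDirectory("xsd-demo").toFile();
			File main = new File(dir, "employees.xsd");
			Files.write(main.toPath(), EMPLOYEES_XSD.getBytes(StandardCharsets.UTF_8));
			Files.write(new File(dir, "admin.xsd").toPath(), ADMIN_XSD.getBytes(StandardCharsets.UTF_8));

			SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
			try (InputStream in = new FileInputStream(main)) {
				// The source has no system ID, so the relative schemaLocation "admin.xsd"
				// is not looked up in the directory that contains employees.xsd.
				sf.newSchema(new StreamSource(in));
				return false;
			} catch (SAXException expected) {
				return true;
			}
		} catch (IOException e) {
			throw new IllegalStateException(e);
		}
	}
}
```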

 

To solve this problem we need an LSResourceResolver. While parsing a schema, SchemaFactory can use an LSResourceResolver to load external resources.

The code is as follows:

package com.javaeye.terrencexu.jaxb;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.Reader;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.log4j.Logger;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

/**
 * 
 * Implement LSResourceResolver to customize resource resolution when parsing schemas.
 * <p>
 * SchemaFactory uses a LSResourceResolver when it needs to locate external resources 
 * while parsing schemas, although exactly what constitutes "locating external resources" 
 * is up to each schema language. 
 * </p>
 * <p>
 * For example, for W3C XML Schema, this includes files &lt;include&gt;d or &lt;import&gt;ed, 
 * and DTD referenced from schema files, etc.
 *</p>
 *
 */
class SchemaResourceResolver implements LSResourceResolver {

	private static final Logger log = Logger.getLogger(SchemaResourceResolver.class);
	
	/**
	 * 
	 * Allow the application to resolve external resources. 
	 * 
	 * <p>
	 * The LSParser will call this method before opening any external resource, including 
	 * the external DTD subset, external entities referenced within the DTD, and external 
	 * entities referenced within the document element (however, the top-level document 
	 * entity is not passed to this method). The application may then request that the 
	 * LSParser resolve the external resource itself, that it use an alternative URI, 
	 * or that it use an entirely different input source. 
	 * </p>
	 * 
	 * <p>
	 * Application writers can use this method to redirect external system identifiers to 
	 * secure and/or local URI, to look up public identifiers in a catalogue, or to read 
	 * an entity from a database or other input source (including, for example, a dialog box).
	 * </p>
	 */
	public LSInput resolveResource(String type, String namespaceURI, String publicId, String systemId, String baseURI) {
		log.info("\n>> Resolving " + "\n"
				          + "TYPE: " + type + "\n"
				          + "NAMESPACE_URI: " + namespaceURI + "\n" 
				          + "PUBLIC_ID: " + publicId + "\n"
				          + "SYSTEM_ID: " + systemId + "\n"
				          + "BASE_URI: " + baseURI + "\n");
		
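		// Nothing to resolve; returning null lets the parser fall back to its default behaviour.
		if (systemId == null) {
			return null;
		}
		
		try {
			URI uri = new URI(systemId);
			// A relative schemaLocation is resolved against the schema that references it.
			if (!uri.isAbsolute() && baseURI != null) {
				uri = new URI(baseURI).resolve(uri);
			}
			// Only local files are handled here; anything else is left to the parser.
			if (!"file".equalsIgnoreCase(uri.getScheme())) {
				return null;
			}
			
			LSInput lsInput = new LSInputImpl();
			lsInput.setPublicId(publicId);
			lsInput.setSystemId(uri.toString());
			lsInput.setBaseURI(baseURI);
			lsInput.setByteStream(new FileInputStream(new File(uri)));
			return lsInput;
		} catch (URISyntaxException e) {
			log.error("Invalid schema location: " + systemId, e);
		} catch (FileNotFoundException e) {
			log.error("Schema file not found: " + systemId, e);
		} catch (IllegalArgumentException e) {
			log.error("Schema location is not a usable file URI: " + systemId, e);
		}
		
		return null;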
	}
	
	/**
	 * 
	 * Represents an input source for data
	 *
	 */
	class LSInputImpl implements LSInput {

		private String publicId;
		private String systemId;
		private String baseURI;
		private InputStream byteStream;
		private Reader charStream;
		private String stringData;
		private String encoding;
		private boolean certifiedText;
		
		public LSInputImpl() {}
		
		public LSInputImpl(String publicId, String systemId, InputStream byteStream) {
			this.publicId = publicId;
			this.systemId = systemId;
			this.byteStream = byteStream;
		}
		
		public String getBaseURI() {
			return baseURI;
		}

		public InputStream getByteStream() {
			return byteStream;
		}

		public boolean getCertifiedText() {
			return certifiedText;
		}

		public Reader getCharacterStream() {
			return charStream;
		}

		public String getEncoding() {
			return encoding;
		}

		public String getPublicId() {
			return publicId;
		}

		public String getStringData() {
			return stringData;
		}

		public String getSystemId() {
			return systemId;
		}

		public void setBaseURI(String baseURI) {
			this.baseURI = baseURI;
		}

		public void setByteStream(InputStream byteStream) {
			this.byteStream = byteStream;
		}

		public void setCertifiedText(boolean certifiedText) {
			this.certifiedText = certifiedText;
		}

		public void setCharacterStream(Reader characterStream) {
			this.charStream = characterStream;
		}

		public void setEncoding(String encoding) {
			this.encoding = encoding;
		}

		public void setPublicId(String publicId) {
			this.publicId = publicId;
		}

		public void setStringData(String stringData) {
			this.stringData = stringData;
		}

		public void setSystemId(String systemId) {
			this.systemId = systemId;
		}
		
	}

}
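Most of the class above is the hand-written LSInputImpl. As a shorter alternative sketch, the DOM Load and Save API can create the LSInput for us through DOMImplementationLS.createLSInput(). The class name `DirectoryResourceResolver`, its single-directory lookup strategy, and the `resolvesImport()` self-check with its trimmed-down schemas are assumptions made for illustration, not part of the original code:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;

import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

// Hypothetical resolver: looks up every imported/included schema by file name in one directory.
final class DirectoryResourceResolver implements LSResourceResolver {

	private final File schemaDir;
	private final DOMImplementationLS ls;

	DirectoryResourceResolver(File schemaDir) {
		DOMImplementationLS impl;
		try {
			impl = (DOMImplementationLS) DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
		} catch (Exception e) {
			throw new IllegalStateException("No DOM Load and Save implementation available", e);
		}
		this.schemaDir = schemaDir;
		this.ls = impl;
	}

	public LSInput resolveResource(String type, String namespaceURI, String publicId, String systemId, String baseURI) {
		if (systemId == null) {
			return null; // let the parser use its default resolution
		}
		// Only the file name of the schemaLocation is used; any directory part is ignored.
		File candidate = new File(schemaDir, new File(systemId).getName());
		if (!candidate.isFile()) {
			return null;
		}
		try {
			LSInput input = ls.createLSInput();
			input.setPublicId(publicId);
			input.setSystemId(candidate.toURI().toString());
			input.setBaseURI(baseURI);
			input.setByteStream(new FileInputStream(candidate));
			return input;
		} catch (FileNotFoundException e) {
			return null;
		}
	}

	/** Self-check: compiles a two-file schema from a bare stream with the resolver installed. */
	static boolean resolvesImport() {
		String ns = "http://www.company.com/management/employees/admin";
		String mainXsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' xmlns:admin='" + ns + "'>"
				+ "<xs:import namespace='" + ns + "' schemaLocation='admin.xsd'/>"
				+ "<xs:element name='employees'><xs:complexType><xs:sequence>"
				+ "<xs:element maxOccurs='unbounded' ref='admin:employee'/>"
				+ "</xs:sequence></xs:complexType></xs:element></xs:schema>";
		String adminXsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' targetNamespace='" + ns + "'>"
				+ "<xs:element name='employee' type='xs:string'/></xs:schema>";
		try {
			File dir = Files.createTempDirectory("xsd-resolver").toFile();
			File main = new File(dir, "employees.xsd");
			Files.write(main.toPath(), mainXsd.getBytes(StandardCharsets.UTF_8));
			Files.write(new File(dir, "admin.xsd").toPath(), adminXsd.getBytes(StandardCharsets.UTF_8));

			SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
			sf.setResourceResolver(new DirectoryResourceResolver(dir));
			try (InputStream in = new FileInputStream(main)) {
				sf.newSchema(new StreamSource(in)); // the import is served by the resolver
				return true;
			}
		} catch (SAXException e) {
			return false;
		} catch (IOException e) {
			throw new IllegalStateException(e);
		}
	}
}
```

Looking files up by name in a single directory keeps the sketch small, but it would not distinguish two schemas with the same file name in different folders; resolving against baseURI, as the class above does, avoids that limitation.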

 

 

The last step is to create a validator class that encapsulates the XML validation logic:

package com.javaeye.terrencexu.jaxb;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.apache.log4j.Logger;
import org.xml.sax.SAXException;

public final class XMLParser {

	private static final Logger log = Logger.getLogger(XMLParser.class);
	
	private XMLParser() {}
	
	public static boolean validateWithSingleSchema(File xml, File xsd) {
		boolean legal = false;
		
		try {
			SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
			Schema schema = sf.newSchema(xsd);
			
			Validator validator = schema.newValidator();
			validator.validate(new StreamSource(xml));
			
			legal = true;
		} catch (Exception e) {
			legal = false;
        	log.error(e.getMessage());
		}
		
		return legal;
	}
	
	public static boolean validateWithMultiSchemas(InputStream xml, List<File> schemas) {
		boolean legal = false;
		
		try {
			Schema schema = createSchema(schemas);
			
			Validator validator = schema.newValidator();
			validator.validate(new StreamSource(xml));
			
			legal = true;
		} catch (Exception e) {
			legal = false;
			log.error(e.getMessage());
		}
		
		return legal;
	}
	
	/**
	 * Create Schema object from the schemas file.
	 * 
	 * @param schemas
	 * @return
	 * @throws ParserConfigurationException
	 * @throws SAXException
	 * @throws IOException
	 */
	private static Schema createSchema(List<File> schemas) throws ParserConfigurationException, SAXException, IOException {
		SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
		SchemaResourceResolver resourceResolver = new SchemaResourceResolver();
		sf.setResourceResolver(resourceResolver);
		
		Source[] sources = new Source[schemas.size()];
		
		DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
		docFactory.setValidating(false);
		docFactory.setNamespaceAware(true);
		DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
		
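		for (int i = 0; i < schemas.size(); i++) {
			File schemaFile = schemas.get(i);
			org.w3c.dom.Document doc = docBuilder.parse(schemaFile);
			// Use a file: URI as the system ID so that relative schemaLocation values can be resolved against it.
			sources[i] = new DOMSource(doc, schemaFile.toURI().toString());
		}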
		
		return sf.newSchema(sources);
	}
	
}

 

4. Run the test

 

public static void testValidate() throws SAXException, FileNotFoundException {
		InputStream xml = new FileInputStream(new File("C:\\eclipse\\workspace1\\JavaStudy\\test\\employees.xml"));
		
		List<File> schemas = new ArrayList<File>();
		schemas.add(new File("C:\\eclipse\\workspace1\\JavaStudy\\test\\employees.xsd"));
		schemas.add(new File("C:\\eclipse\\workspace1\\JavaStudy\\test\\admin.xsd"));
		
		XMLParser.validateWithMultiSchemas(xml, schemas);
	}

 

Note: if the two schema files are in the same directory, it is enough to pass only the main schema file (employees.xsd); SchemaResourceResolver will load admin.xsd for us.
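The single-entry-point idea from the note can be sketched as follows. Note the assumptions: instead of SchemaResourceResolver, this sketch relies on the schema being compiled from a File, in which case the JDK's built-in implementation resolves the relative schemaLocation against that file's own location. The class name `MainSchemaOnlyDemo` is hypothetical, and the schemas and document are trimmed-down stand-ins for the files used in this article:

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

import org.xml.sax.SAXException;

// Hypothetical demo class for illustration only.
final class MainSchemaOnlyDemo {

	private static final String ADMIN_NS = "http://www.company.com/management/employees/admin";

	/** Returns true if the document validates when only employees.xsd is passed to the factory. */
	static boolean validatesWithMainSchemaOnly() {
		String mainXsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' xmlns:admin='" + ADMIN_NS + "'>"
				+ "<xs:import namespace='" + ADMIN_NS + "' schemaLocation='admin.xsd'/>"
				+ "<xs:element name='employees'><xs:complexType><xs:sequence>"
				+ "<xs:element maxOccurs='unbounded' ref='admin:employee'/>"
				+ "</xs:sequence></xs:complexType></xs:element></xs:schema>";
		String adminXsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' targetNamespace='" + ADMIN_NS + "'>"
				+ "<xs:element name='employee' type='xs:string'/></xs:schema>";
		String xml = "<employees xmlns:admin='" + ADMIN_NS + "'>"
				+ "<admin:employee>John Smith</admin:employee></employees>";
		try {
			File dir = Files.createTempDirectory("xsd-main-only").toFile();
			File mainFile = new File(dir, "employees.xsd");
			File xmlFile = new File(dir, "employees.xml");
			Files.write(mainFile.toPath(), mainXsd.getBytes(StandardCharsets.UTF_8));
			Files.write(new File(dir, "admin.xsd").toPath(), adminXsd.getBytes(StandardCharsets.UTF_8));
			Files.write(xmlFile.toPath(), xml.getBytes(StandardCharsets.UTF_8));

			SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
			// Only the main schema is named; admin.xsd is found through the <xs:import>.
			Schema schema = sf.newSchema(mainFile);
			schema.newValidator().validate(new StreamSource(xmlFile));
			return true;
		} catch (SAXException e) {
			return false;
		} catch (IOException e) {
			throw new IllegalStateException(e);
		}
	}
}
```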

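Comments
#2 jamesatcaas 2016-06-16  
Hello, this article is a very useful reference. I followed the example, but it reports the following error:

XML file: E:\test2\employees.xsd failed validation against XSD file: E:\test2\admin.xml.
Cause: s4s-elt-schema-ns: The namespace of element 'employees' must be from the schema namespace, 'http://www.w3.org/2001/XMLSchema'.
org.xml.sax.SAXParseException; s4s-elt-schema-ns: The namespace of element 'employees' must be from the schema namespace, 'http://www.w3.org/2001/XMLSchema'.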
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:134)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:437)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:347)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.reportSchemaErr(XSDHandler.java:4166)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.reportSchemaError(XSDHandler.java:4145)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSAttributeChecker.reportSchemaError(XSAttributeChecker.java:1568)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSAttributeChecker.checkAttributes(XSAttributeChecker.java:994)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSAttributeChecker.checkAttributes(XSAttributeChecker.java:962)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDocumentInfo.<init>(XSDocumentInfo.java:106)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.constructTrees(XSDHandler.java:782)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.parseSchema(XSDHandler.java:620)
at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadSchema(XMLSchemaLoader.java:616)
at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:574)
at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:540)
at com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory.newSchema(XMLSchemaFactory.java:252)
at nstl.xml.schema.validator.XMLValidatorWithMultiSchemas.createSchema(XMLValidatorWithMultiSchemas.java:319)
at nstl.xml.schema.validator.XMLValidatorWithMultiSchemas.validateWithMultiSchemas(XMLValidatorWithMultiSchemas.java:60)
at nstl.xml.schema.validator.XMLValidatorWithMultiSchemas.main(XMLValidatorWithMultiSchemas.java:49)
#1 jarip 2013-07-09  
Nice, a very useful reference. Thumbs up!
