Cleaning up SOAP Web Services
Speaker Qualifications
- Developer with over 30 years' experience
- Systems software, data communications
- Wide range of commercial products
- Strictly Java for 6+ years
- Enterprise Java and XML for 4+ years
- Member JAXB & JAX-RPC 2.0 Expert Groups
- JavaWorld and IBM developerWorks author
- President and lead consultant for S3i
Outline of Talk
- From XML-RPC to WS-I BP 1.0a
- Web services in Java
- Alternative approaches using JAX-RPC and Axis
- Next-generation web services frameworks
- Web service performance
- Using attachments with web services
- Securing and hardening web services
The roots of the hairball
- Many different ways to expose applications:
- Direct socket connection (custom data formats)
- Remote Procedure Call (RPC)
- CORBA cross-language calls (or DCOM)
- All have drawbacks for interoperability
- XML seems ideal as a way around problems
RPC with XML
- XML-RPC the simple form of XML web services
- Remote calls with XML encoding
- Less efficient than binary, but much more convenient
- Developed by Dave Winer of Userland
- In cooperation with Microsoft (SOAP)
- Publicly released while SOAP still in development
- Largely "self-describing"
XML-RPC structure
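A minimal sketch of an XML-RPC request, built by hand in Java. The method name `quakes.countSince` and its arguments are invented for illustration; only the `methodCall` / `params` / `value` layout follows the XML-RPC specification. Note how every value carries its own type element, which is the in-band type information that makes the format self-describing but bulky.

```java
import java.util.Arrays;
import java.util.List;

/** Builds an XML-RPC request document by hand (illustrative sketch only). */
class XmlRpcCall {

    /** Escapes the characters that cannot appear literally in XML text. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Wraps one Java value in the matching XML-RPC type element. */
    static String value(Object arg) {
        if (arg instanceof Integer) {
            return "<value><int>" + arg + "</int></value>";
        } else if (arg instanceof Double) {
            return "<value><double>" + arg + "</double></value>";
        } else if (arg instanceof Boolean) {
            return "<value><boolean>" + (((Boolean) arg) ? 1 : 0) + "</boolean></value>";
        }
        // Anything else is sent as a string in this sketch.
        return "<value><string>" + escape(String.valueOf(arg)) + "</string></value>";
    }

    /** Serializes a complete methodCall document for the given arguments. */
    static String methodCall(String methodName, List<Object> args) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>");
        xml.append("<methodCall><methodName>").append(escape(methodName))
           .append("</methodName><params>");
        for (Object arg : args) {
            xml.append("<param>").append(value(arg)).append("</param>");
        }
        return xml.append("</params></methodCall>").toString();
    }

    public static void main(String[] args) {
        // Hypothetical call: count quakes recorded since 2003 in a named region.
        System.out.println(methodCall("quakes.countSince",
                Arrays.<Object>asList(2003, "Alaska")));
    }
}
```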
Why not use XML-RPC?
- Limited expressiveness:
- Only a few data types
- Handles nested structure trees, but not graphs
- Not very "XMLish"
- Rigid format doesn't allow for general XML
- In-band type information bulky and redundant
- Microsoft wanted more...
SOAP basics
- SOAP more-complex version of same approach:
- More sophisticated encoding scheme
- Namespaces to keep components clear
- Microsoft-sponsored effort for 1.0
- IBM joined in promoting as standard in 1.1
- Basis for .NET marketing campaign
- What most people mean by “web services”
What is SOAP?
- Defines wrapper for application data
- Actual data format can vary:
- Specification defines one encoding (rpc)
- Applications can define their own (or none)
- Actual connection can vary:
- Usually request-response structure
- Transport normally HTTP, but can be anything (SMTP, etc.)
SOAP Message Structure
- Envelope is wrapper for content, but no useful information
- Optional header can contain control information
- Body contains actual data in XML encoding
- Attachments can hold other types of data (binary, unencoded text, etc.)
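As a rough illustration of the Envelope / Header / Body layout, the sketch below builds a SOAP 1.1 envelope with the JDK's DOM API. The application namespace, the `app:` prefix, and the payload element are assumptions made up for the example; only the SOAP 1.1 envelope namespace is standard.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Builds an empty-header SOAP 1.1 envelope around one payload element. */
class SoapEnvelopeSketch {

    /** Namespace of the SOAP 1.1 envelope elements. */
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    static Document build(String appNamespace, String payloadName, String payloadText) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // The envelope is only a wrapper; it carries no information itself.
            Element envelope = doc.createElementNS(SOAP_NS, "soap:Envelope");
            doc.appendChild(envelope);

            // Optional header: control information would go here.
            envelope.appendChild(doc.createElementNS(SOAP_NS, "soap:Header"));

            // Body: the actual application data, in the application's namespace.
            Element body = doc.createElementNS(SOAP_NS, "soap:Body");
            envelope.appendChild(body);
            Element payload = doc.createElementNS(appNamespace, "app:" + payloadName);
            payload.setTextContent(payloadText);
            body.appendChild(payload);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Document doc = build("urn:example:quakes", "query", "Alaska");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```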
RPC encoded SOAP
- Remote Procedure Call – like CORBA or RMI
- Call parameters and result encoded in body
- Encoding allows for graphs of simple objects
- As with CORBA, allows [in/out] and [out] parameters, returned in addition to the return value
- Optionally used by .NET
Document literal
- An XML document is passed each way
- Flexible, but less automatic for code:
- Working with XML documents, not method calls
- Requires code at each end to convert to data
- Equivalent to direct XML interface embedded in SOAP
Wrapped
- Microsoft approach for easy service interface:
- Given a call with parameters "a", "b" and "c"
- Define a top-level element with children "a", "b" and "c"
- Pass the element as document literal SOAP body
- Uses Microsoft encoding scheme
- Preferred .NET approach
- Represented as doc/lit
- Limited by the encoding
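The wrapping step described above can be sketched in a few lines of Java: the operation name becomes the top-level element and each parameter becomes a child. The namespace, operation name, and the choice to put the children in the wrapper's default namespace are assumptions for illustration, not a statement of what any particular toolkit emits.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Turns a call with named parameters into a "wrapped" document/literal body element. */
class WrappedBody {

    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** The wrapper element is named after the operation; parameters keep their order. */
    static String wrap(String namespace, String operation, Map<String, String> params) {
        StringBuilder xml = new StringBuilder();
        xml.append('<').append(operation).append(" xmlns=\"").append(namespace).append("\">");
        for (Map.Entry<String, String> param : params.entrySet()) {
            xml.append('<').append(param.getKey()).append('>')
               .append(escape(param.getValue()))
               .append("</").append(param.getKey()).append('>');
        }
        return xml.append("</").append(operation).append('>').toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("a", "1");
        params.put("b", "2");
        params.put("c", "3");
        // Hypothetical operation name and namespace.
        System.out.println(wrap("urn:example:calc", "add", params));
    }
}
```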
The WSDL additive
- Web Services Description Language
- Based on abstract definitions and bindings:
- types -- types defined for use in messages
- messages -- the data being exchanged
- port types -- collections of operations
- binding -- concrete protocol and data format
- service -- a collection of bound ports at address
- Allows automatic proxy generation, etc.
Types and messages
- Types optionally define structures for exchange (with embedded schema definitions)
- Message represents one interaction
- One or more "parts" in message
- Describes the content to be sent in an operation
- Can make use of type definitions (or schema types)
Port types
- Abstract operations, of four possible types:
- One way port (only input message defined)
- Request-response port (input message + output message)
- Solicit-response port (output message + input message)
- Notification port (as in publish/subscribe) (output message)
- Message(s) involved in each operation
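In WSDL 1.1 the four operation types are distinguished only by which of the input and output elements an operation declares, and in what order. A minimal sketch of that rule, with class and method names invented for illustration:

```java
import java.util.Arrays;
import java.util.List;

/** Classifies a WSDL 1.1 operation from the order of its input/output children. */
class WsdlOperationKind {

    enum Kind { ONE_WAY, REQUEST_RESPONSE, SOLICIT_RESPONSE, NOTIFICATION }

    /**
     * @param messageElements local names of the operation's "input" and "output"
     *                        children in document order (fault elements left out)
     */
    static Kind classify(List<String> messageElements) {
        if (messageElements.equals(Arrays.asList("input"))) {
            return Kind.ONE_WAY;               // only an input message defined
        } else if (messageElements.equals(Arrays.asList("input", "output"))) {
            return Kind.REQUEST_RESPONSE;      // input message, then output message
        } else if (messageElements.equals(Arrays.asList("output", "input"))) {
            return Kind.SOLICIT_RESPONSE;      // output message, then input message
        } else if (messageElements.equals(Arrays.asList("output"))) {
            return Kind.NOTIFICATION;          // only an output message defined
        }
        throw new IllegalArgumentException("not a valid operation: " + messageElements);
    }

    public static void main(String[] args) {
        System.out.println(classify(Arrays.asList("input", "output")));
    }
}
```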
Bindings
- Defines protocol-specific binding data
- Underlying transport protocol
- Communication style for SOAP
- Each binding references one port type
Service and port definition
- Service gives name for a set of ports
- Each port definition specifies a single address for a binding
SOAP/WSDL Applications
- Web page equivalent services:
- Information retrieval (quotes, weather, prices, etc.)
- Calculations (interest payments, conversions, etc.)
- Simple actions (messaging, registration, etc.)
- .NET "next wave of Internet" idea a flop:
- Exposing services to all and sundry a security risk
- Dubious advantages over HTML "web services"
- What would service consumers pay for?
Real-world usage
- Behind the firewall -- hotter all the time
- .NET approach for client-server
- Connecting internal enterprise applications
- CORBA-equivalent for cross-platform linkage
- Growing use for private access:
- Expose applications directly to customers
- Can use PKI, dedicated IP addresses, etc.
- Also for limited range of public-access services
Reality check
- Interoperability the purpose of SOAP/WSDL
- Still problems in many areas:
- Session support
- Security
- Attachments
- Even basic formats an issue...
- SOAP encoding a primitive form of data binding
- Cross-language and cross-platform, but limited
Encoding issues
- Data types not cross-language:
- Java has no unsigned types, so no JAX-RPC support
- Incompatibilities with char, floating point, etc.
- Complex structures even more of a problem:
- Constructs such as HashMap have no standard form
- Custom serializers / deserializers needed everywhere
- Incompatible quirks even in SOAP encoding
- Use xsi:type or not?
- multiRef handling
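To make the unsigned-type problem concrete: the largest `xsd:unsignedInt` value does not fit in a Java `int`. One common workaround, shown here as a sketch and not as anything JAX-RPC itself prescribes, is to widen to the next larger signed type and range-check by hand. The class and method names are invented for the example.

```java
/** Maps schema unsigned types onto wider signed Java types (illustrative workaround). */
class UnsignedMapping {

    /** xsd:unsignedInt covers 0..4294967295, so the value is held in a long. */
    static long parseUnsignedInt(String lexical) {
        long value = Long.parseLong(lexical.trim());
        if (value < 0 || value > 0xFFFFFFFFL) {
            throw new NumberFormatException("out of range for xsd:unsignedInt: " + lexical);
        }
        return value;
    }

    /** xsd:unsignedByte covers 0..255, so the value is held in a short. */
    static short parseUnsignedByte(String lexical) {
        short value = Short.parseShort(lexical.trim());
        if (value < 0 || value > 255) {
            throw new NumberFormatException("out of range for xsd:unsignedByte: " + lexical);
        }
        return value;
    }

    public static void main(String[] args) {
        // Integer.parseInt would throw NumberFormatException on this input.
        System.out.println(parseUnsignedInt("4294967295"));
    }
}
```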
Too many choices!
- Web services intended for interoperability
- Difficult to support all possible choices
- Results in incompatibility
- Interoperation requires constant testing
- Microsoft makes their choices, standard or not
- (Everybody else kludges to work with Microsoft)
WS-I
- "Web Services Interoperability Organization"
- Original primary backers IBM, Microsoft, BEA, etc.
- Selecting options rather than writing standards
- Basic Profile a first goal:
- Decide on "best practices"
- Reduce choices to promote interoperability
- Basic Profile 1.0a final as of August
WS-I Basic Profile
- Builds on wide range of standards
- Limits choices to increase interoperability:
- Restricts XML usage (encodings, components)
- Standardizes HTTP response codes
- Allows services to require SSL/TLS
- And the biggie -- prohibits encodingStyle
- rpc/enc not allowed
- rpc/lit allowed, but basically useless
- doc/lit the clear winner
JAX-RPC
- Java API for XML-based RPC
- Portable and interoperable web services using Java
- Developed through Java Community Process
- Core technology for J2EE 1.4
- JSR-109 defines web services deployment descriptors
- With EJB 2.1 supports direct stateless session bean web services
- Now available as 1.1 release
What it does
- JAX-RPC maps SOAP/WSDL to RMI:
- WSDL port equivalent to remote interface (interface extending java.rmi.Remote)
- WSDL operation to remote interface method
- Methods throw java.rmi.RemoteException (SOAP Faults converted to/from RemoteExceptions)
- Uses rpc/enc style or direct doc/lit mapping
- Supports a subset of RMI operations
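A sketch of what the RMI-shaped mapping looks like on the Java side. The interface stands in for a WSDL port and the method for a WSDL operation; the names `QuakeQueryPort` and `countQuakes` and the canned result are hypothetical, chosen only to show the shape that the mapping requires.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

/** Hypothetical service endpoint interface: one WSDL port maps to one Remote interface. */
interface QuakeQueryPort extends Remote {

    /** One WSDL operation maps to one method; faults surface as RemoteException. */
    int countQuakes(String region, double minMagnitude) throws RemoteException;
}

/** Server-side implementation with canned data, standing in for real application logic. */
class QuakeQueryPortImpl implements QuakeQueryPort {

    // An implementation may narrow the throws clause; callers that go through
    // the interface still have to handle RemoteException.
    public int countQuakes(String region, double minMagnitude) {
        return ("Alaska".equals(region) && minMagnitude <= 5.0) ? 3 : 0;
    }

    public static void main(String[] args) throws RemoteException {
        QuakeQueryPort port = new QuakeQueryPortImpl();
        System.out.println(port.countQuakes("Alaska", 4.5));
    }
}
```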
WS-I BP impact
- JAX-RPC designed around RPC encoding
- Everything shaped to RMI model
- Allows document/literal, but with limitations:
- Automatic conversion for subset of SOAP encoding
- Main difference is handling only trees, not graphs
- Requires JavaBeans (usually generated) to match XML structure
- Other cases use javax.xml.soap.SOAPElement
- Method needs to process DOM-style tree of objects
- Some changes needed
JAX-RPC future
- JAX-RPC 1.1 includes WS-I BP support
- Compliant with all conditions of BP
- Still same limits on working with document/literal
- JAX-RPC 2.0 the coming solution
- Unfortunately, probably not available until 2005
- Planned to work with JAXB 2.0 for data binding
- A problem for data-oriented SOAP support
- Not a problem for simple structures
- Not a problem if you can work directly with DOM
Data binding SOAP
- Data binding maps XML to / from your objects
- Ideal for document/literal SOAP
- (Usually) fast and (usually) memory-efficient
- Application gets convenient access to data
- JAX-RPC headed that way
- But will it be a good solution?
- And what about between now and when JAXB/JAX-RPC 2.0 is ready?
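To show what "maps XML to / from your objects" means in practice, here is a hand-written binding for a single made-up `quake` element. A data binding framework generates or configures this kind of code for you; the element names and the `Quake` class are assumptions for the sketch and are not taken from the example application.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hand-coded XML-to-object mapping for one element type (illustrative only). */
class QuakeBinding {

    /** Plain data object the application works with instead of a DOM tree. */
    static final class Quake {
        final String region;
        final double magnitude;

        Quake(String region, double magnitude) {
            this.region = region;
            this.magnitude = magnitude;
        }
    }

    /** XML to object. */
    static Quake unmarshal(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            String region = root.getElementsByTagName("region").item(0).getTextContent();
            String magnitude = root.getElementsByTagName("magnitude").item(0).getTextContent();
            return new Quake(region, Double.parseDouble(magnitude.trim()));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a quake document", e);
        }
    }

    /** Object to XML. */
    static String marshal(Quake quake) {
        String region = quake.region.replace("&", "&amp;").replace("<", "&lt;");
        return "<quake><region>" + region + "</region><magnitude>"
                + quake.magnitude + "</magnitude></quake>";
    }

    public static void main(String[] args) {
        Quake quake = unmarshal("<quake><region>Alaska</region><magnitude>6.1</magnitude></quake>");
        System.out.println(quake.region + " " + quake.magnitude);
    }
}
```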
Data binding SOAP
- Current free support limited:
- JAX-RPC has none (except internal projects at Sun)
- Apache Axis can be kludged for Castor
- Commercial implementations use built-in handling
- Fundamentally different from rpc-enc approach
- Best solution may be new frameworks
JiBX SOAP framework
- Simple framework for document/literal SOAP
- No support for RPC/encoded (and never will be)
- Requires binding definition for mapping
- But can generate binding and initial classes from schema
- Simple configuration file for service
- Generates WSDL based on schema and configuration
Example application
- Earthquake information Web service
- Query by date, location, magnitude, etc.
- Returns results sorted by area
- Region information returned when used
- Variations tested:
- JAX-RPC RI -- WSDL doc/lit to Java code
- Axis -- WSDL rpc/enc to Java, WSDL doc/lit to Java, and using Castor for data binding
- JiBX SOAP -- direct data binding
Axis-Castor version
- Requires some assembly...
- Start with simplified doc/lit WSDL
- Define fake structures for request/response
- Serve as placeholders in code generation
- Generate code as for normal doc/lit
- Blend in Castor generated code from original schema
- Modify server type mappings in deploy.wsdd
- Change generated client code to use Castor classes
- Then normal Axis deployment
Performance test
- Use pseudo-random sequence for queries
- Tune request ranges for different densities:
- Very low -- 400 queries with just 88 matching quakes
- Medium -- 100 queries with 853 matching quakes
- High -- 20 queries with 3052 matching quakes
- Very high -- 10 queries with 6155 matching quakes
- Verify number of quakes returned for each, etc.
Performance
Why the differences?
- Axis rpc/enc much bulkier than doc/lit used
- Includes xsi:type information by default
- Redeclares namespaces repeatedly
- JiBX data binding much faster than Castor or doc/lit binding in Axis and JAX-RPC
- Commercial products probably in between
- Glue and WASP declined benchmark permission
SOAP Attachments
- SOAP Body content issues:
- Character set restricted in XML
- XML must be well formed
- Binary data must be encoded with heavy overhead
- Attachments provide way to avoid these issues:
- SOAP with Attachments (SwA, MIME-based)
- Direct Internet Message Exchange (DIME)
- Message Transmission Optimization Mechanism (MTOM)
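The encoding overhead for binary data carried in the SOAP Body is easy to measure. The sketch below uses Base64, the usual choice for embedding binary in XML; the class name and payload sizes are arbitrary.

```java
import java.util.Base64;

/** Measures how much a binary payload grows when Base64-encoded for embedding in XML. */
class Base64Overhead {

    /** Number of characters needed to carry the given bytes as Base64 text. */
    static int encodedLength(byte[] binary) {
        return Base64.getEncoder().encodeToString(binary).length();
    }

    public static void main(String[] args) {
        byte[] payload = new byte[3000];
        System.out.println(payload.length + " bytes -> "
                + encodedLength(payload) + " characters");
    }
}
```

Every 3 bytes of input become 4 characters of output, so size grows by a third before any XML markup is added, and both ends pay for the encode and decode passes.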
SwA
- Early proposal from Microsoft for standard
- Based on existing MIME standard
- Attachments follow actual SOAP Envelope
- Uses boundary string and content type for each block
- Supported by Sun for Java with SAAJ
- SOAP with Attachments API for Java
- Integrated with JAX-RPC
- Not currently supported by Microsoft
- Usefulness thereby highly limited
DIME
- More recent proposal from Microsoft for standard
- Uses binary header with data length
- Attachments again follow SOAP envelope
- No standard support for Java
- Individual implementations may include (e.g., Axis)
- Microsoft supports in advanced services pack
MTOM
- Protocol layering is such a 20th Century idea
- Make everything part of the XML Infoset
- Let the code figure out how best to serialize
- Advantage is that application can ignore attachments
- Disadvantage is that application has no control over attachments
- Good idea or bad, it's the coming thing
WS-I Basic Profile 1.1
- Basic Profile 1.0a addressed SOAP/WSDL usage
- Basic Profile 1.1 addresses attachments:
- Referenced Attachments Profile 1.0 uses SwA
- Defines a special type for attachment references:
- Won't be supported by Microsoft, so largely irrelevant
Attachments summary
- No one standard
- Not all SOAP implementations support any form
- Some support SwA, some DIME, some both
- Not yet usable for general-purpose interfaces
- Use to meet requirements when you can
- Must be able to restrict client pool
SOAP security
- Sensitive applications growing, security crucial
- WS-I BP adopts (but doesn't require) TLS/SSL
- SSL widely implemented and widely supported
- Services may require SSL and use HTTPS endpoint
- Same as browser secure connection
- Negotiation between client and server assures secrecy
- May also require mutual authentication
- Separate certificates for each end
- Assures the client is who you think it is
SSL SOAP
- Basic SSL provides transport confidentiality
- Supported by most implementations
- Generally requires only server configuration
- Operates transparently (but with added value)
- Mutual authentication SSL for SOAP
- Generally just a flag setting for server
- Not all clients are set up to support
- Reasonably fast and highly secure
- But only good for point-to-point
Beyond point-to-point
- Consider complex distributed application
- Order from customer to store on credit card
- Store needs to see order information, not credit card
- Can pass encrypted credit card info on to bank
- Perhaps use digital signature to authorize payment
- Point-to-point security not enough
What about WS-Security?
- Now an OASIS standard
- Supports wide range of security features
- Two basic aspects:
- Assuring message confidentiality (encryption)
- Assuring message integrity and authenticity (signing)
- Uses header fields ("tokens") for security information
- Targeted to a particular recipient (intermediate or end)
- Extensible to support all types of security tokens
- Can build on HTTPS/SSL for transport security
WS-Security details
- Message signing support
- Based on XML Signature
- Multiple signatures across all or parts of document
- Message confidentiality support
- Based on XML Encryption
- Multiple encryptions across all or parts of document
- Associated standards for details (x509, user name, etc.)
XML Signature
- XML Signature can be applied to any data
- Portion of XML document
- Entire XML document
- Data external to document (but accessible)
- Provides everything necessary for verification
- Canonicalization method, signature method
- Source, digest method, and digest for each resource
- Signed digest of all of the above
- Certificate (verifiable) with public key
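The two-level structure listed above (a digest per resource, then a signature over the block that lists those digests) can be sketched with plain `java.security` primitives. This is not XML Signature syntax: the `uri=...;digestMethod=...;digest=...` string is a made-up stand-in for the real SignedInfo element, and the RSA and SHA-256 choices are assumptions for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

/** Digest-then-sign sketch mirroring the layering of XML Signature (toy format). */
class SignedDigestSketch {

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Source, digest method, and digest for one resource (stand-in for SignedInfo). */
    static String signedInfo(String uri, String resourceText) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(resourceText.getBytes(StandardCharsets.UTF_8));
            return "uri=" + uri + ";digestMethod=SHA-256;digest="
                    + Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Signs the digest listing, not the resource itself. */
    static byte[] sign(KeyPair keys, String signedInfo) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(keys.getPrivate());
            signer.update(signedInfo.getBytes(StandardCharsets.UTF_8));
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Recomputes the digest from the received resource and checks the signature over it. */
    static boolean verify(PublicKey key, String uri, String receivedText, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(signedInfo(uri, receivedText).getBytes(StandardCharsets.UTF_8));
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A changed resource produces a different digest, so the recomputed digest listing no longer matches what was signed and verification fails.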
XML Canonicalization
- Digital signature guarantees data is unchanged
- Easy for fixed data, but can be problematic for XML
- Parsing and serializing document may change text
- Attribute order may be different
- Whitespace and line endings can be different, etc.
- Canonicalization gives unique text
- Chooses how to do serialization (such as attribute order)
- Not intended as a general "preferred" serialization
- Standard signatures can apply to canonical text
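The attribute-order problem can be demonstrated with a toy canonicalizer. The sketch below is deliberately NOT an implementation of the W3C Canonical XML algorithm: it only sorts attributes by name and writes empty elements as start/end pairs, and it ignores namespaces, comments, and whitespace rules. It is enough to show two equivalent serializations hashing differently as raw text but identically once normalized.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

/** Toy normalizer: sorted attributes, explicit end tags. Not real C14N. */
class ToyCanonicalizer {

    static String canonicalize(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            StringBuilder out = new StringBuilder();
            write(root, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("unparseable XML", e);
        }
    }

    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }

    private static void write(Node node, StringBuilder out) {
        if (node instanceof Text) {
            out.append(escape(node.getNodeValue()));
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // comments and processing instructions are dropped in this sketch
        }
        // Fix one attribute order so that equivalent documents serialize identically.
        Map<String, String> sorted = new TreeMap<>();
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            sorted.put(attributes.item(i).getNodeName(), attributes.item(i).getNodeValue());
        }
        out.append('<').append(node.getNodeName());
        for (Map.Entry<String, String> attribute : sorted.entrySet()) {
            out.append(' ').append(attribute.getKey())
               .append("=\"").append(escape(attribute.getValue())).append('"');
        }
        out.append('>');
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            write(child, out);
        }
        out.append("</").append(node.getNodeName()).append('>');
    }

    static String sha256Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```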
XML Encryption
- XML Encryption can be applied to any data
- Data can be embedded within encryption element
- Encryption element can reference the raw data
- Encryption can be nested
- Embedded encryption useful for controlled access
- Direct recipient may not need all particulars of data
- Key "handle" information can be included
- Not actual key, since symmetric encryption used
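A sketch of embedded encryption with a key "handle": one field of an order is encrypted with a symmetric key, and the message carries only a name for that key. The element names, the `keyName` attribute, and the use of AES-GCM are assumptions for the example; this is not XML Encryption syntax.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Encrypts one field so intermediaries can forward it without being able to read it. */
class EmbeddedEncryptionSketch {

    private static final int IV_BYTES = 12;

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns Base64 text holding a fresh random IV followed by the ciphertext. */
    static String encrypt(SecretKey key, String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] sealed = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + sealed.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(sealed, 0, out, iv.length, sealed.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static String decrypt(SecretKey key, String encoded) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The message names the key (a handle); it never carries the key itself. */
    static String order(String item, String keyName, String encryptedCard) {
        return "<order><item>" + item + "</item><card keyName=\"" + keyName + "\">"
                + encryptedCard + "</card></order>";
    }
}
```

In the store-and-bank scenario, the store can read the `item` element and pass the `card` element along untouched; only a party that already holds the key named by the handle can decrypt it.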
WS-Security importance
- Small near-term, big for future
- Use for specialized needs (intermediaries, etc.)
- Possibly use for custom tokens (encrypted user name and password) as Header fields
- Use basic SSL where you can until WS-Security is accepted and widely supported
- Probably 1-2 years before in general use
- Limited support in current toolkits
- Interoperability possible, but far from automatic
Hardening services
- Hardening a separate issue from securing
- Securing allows only intended uses
- Hardening blocks interference
- Some techniques apply to both
- Access control (via IP address list, certificates, etc.)
- Authentication (to make sure user is valid)
- Some techniques can be at cross-purposes
Security can cause problems
- Consider public order service (e.g., Amazon)
- Need to verify customer identity and order integrity
- Create audit trail for chain of trust
- XML Signature seems ideal
- Overhead of XML Signature creates soft spot
- Easy to overload processing with garbage orders
- Will fail authentication -- but only after cost incurred
- How to avoid weakness?
Securing a retail service
- Best approach multiple layers of authentication
- First layer just a screen (user name, password digest)
- Second layer can be more secure (XML Signature)
- Low overhead to verify first layer, but difficult to attack
- But need to make sure order is as intended!
- Toolkit support can create problems
- Security needs to be under your control, not toolkit's
- Must be able to apply checks selectively and in order
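The layering can be sketched as a cheap screen that runs before the expensive check. The digest follows the WS-Security UsernameToken formula, Base64(SHA-1(nonce + created + password)); the class name, the counter, and the `BooleanSupplier` standing in for XML Signature verification are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.function.BooleanSupplier;

/** Runs a cheap password-digest screen before any expensive signature check. */
class LayeredCheck {

    /** Counts how often the expensive second layer actually ran. */
    static int expensiveChecks = 0;

    /** UsernameToken-style digest: Base64(SHA-1(nonce + created + password)). */
    static String passwordDigest(byte[] nonce, String created, String password) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            sha1.update(nonce);
            sha1.update(created.getBytes(StandardCharsets.UTF_8));
            sha1.update(password.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(sha1.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * @param signatureCheck stand-in for the costly second layer (for example,
     *                       XML Signature verification); only invoked when the
     *                       first layer passes
     */
    static boolean accept(String storedPassword, byte[] nonce, String created,
                          String claimedDigest, BooleanSupplier signatureCheck) {
        String expected = passwordDigest(nonce, created, storedPassword);
        boolean screened = MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                claimedDigest.getBytes(StandardCharsets.UTF_8));
        if (!screened) {
            return false; // garbage requests are rejected at the cost of one hash
        }
        expensiveChecks++;
        return signatureCheck.getAsBoolean();
    }
}
```

A real service would also need to reject reused nonces and stale timestamps, which this sketch omits.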
What about UDDI?
- UDDI an automated directory service
- Automatically register your services
- Lookup other services of interest
- Very fun, but "where's the beef?"
- Big gap between WSDL and UDDI functionalities
- No automated way for programs to use information
- Human interpretation required to write connecting code
- Why not just look up via HTML web page?
- Still a solution in search of a problem
When to SOAP?
- SOAP interfaces best for:
- Granular interface (as with remote EJB interfaces)
- Moderate access frequency
- Moderate data volume
- Highly-interoperable applications
- Published service interfaces
- Clients using different platforms or languages
- Not a replacement for Java-specific RMI
- RMI much more flexible and performant than SOAP
Transaction rates
- Transaction rates a core performance issue
- As with RI EJBs, use coarse-grained calls
- Get as much done as feasible in a single call
- Avoid multiple calls to retrieve related values
- If a requirement, look into alternatives
- Commercial frameworks give higher transaction rates
- Custom framework can do at least as well
Data volumes
- Data volumes second core performance issue
- XML representation of data can be bulky
- SOAP (especially encoded) has even more overhead
- Costly both for data size and processing overhead
- XML representation changes can help
- document/literal allows control over XML format
- Attachments best for large blocks
- Binary data sent with minimal overhead
- XML data sent without need for embedding
Conclusion
- SOAP/WSDL Web services are here to stay
- Committed backing by major players
- Allow good interoperability when used properly
- Performance issues can be a concern
- Faster frameworks can help
- Attachments can help
- Still many “Here there be Dragons” on map
- Attachment techniques, security, etc.
- Know which parts are solid and which are sand
References
Questions?
Contact me at dms@sosnoski.com
Please fill out and turn in session evaluation form!