esper.codehaus.org and espertech.com Documentation

Esper Reference

Version 5.2.0


Preface
1. Technology Overview
1.1. Introduction to CEP and event series analysis
1.2. CEP and relational databases
1.3. The Esper engine for CEP
1.4. Required 3rd Party Libraries
2. Event Representations
2.1. Event Underlying Java Objects
2.2. Event Properties
2.2.1. Escape Characters
2.2.2. Expression as Key or Index Value
2.3. Dynamic Event Properties
2.4. Fragment and Fragment Type
2.5. Plain-Old Java Object Events
2.5.1. Java Object Event Properties
2.5.2. Property Names
2.5.3. Parameterized Types
2.5.4. Setter Methods for Indexed and Mapped Properties
2.5.5. Known Limitations
2.6. java.util.Map Events
2.6.1. Overview
2.6.2. Map Properties
2.6.3. Map Supertypes
2.6.4. Advanced Map Property Types
2.7. Object-array (Object[]) Events
2.7.1. Overview
2.7.2. Object-Array Properties
2.7.3. Object-Array Supertype
2.7.4. Advanced Object-Array Property Types
2.8. org.w3c.dom.Node XML Events
2.8.1. Schema-Provided XML Events
2.8.2. No-Schema-Provided XML Events
2.8.3. Explicitly-Configured Properties
2.9. Additional Event Representations
2.10. Updating, Merging and Versioning Events
2.11. Coarse-Grained Events
2.12. Event Objects Instantiated and Populated by Insert Into
2.13. Comparing Event Representations
3. Processing Model
3.1. Introduction
3.2. Insert Stream
3.3. Insert and Remove Stream
3.4. Filters and Where-clauses
3.5. Time Windows
3.5.1. Time Window
3.5.2. Time Batch
3.6. Batch Windows
3.7. Aggregation and Grouping
3.7.1. Insert and Remove Stream
3.7.2. Output for Aggregation and Group-By
3.8. Event Visibility and Current Time
4. Context and Context Partitions
4.1. Introduction
4.2. Context Declaration
4.2.1. Context-Provided Properties
4.2.2. Keyed Segmented Context
4.2.3. Hash Segmented Context
4.2.4. Category Segmented Context
4.2.5. Non-Overlapping Context
4.2.6. Overlapping Context
4.2.7. Context Conditions
4.3. Context Nesting
4.3.1. Built-In Nested Context Properties
4.4. Partitioning Without Context Declaration
4.5. Output When Context Partition Ends
4.6. Context and Named Window
4.7. Context and Tables
4.8. Context and Variables
4.9. Operations on Specific Context Partitions
5. EPL Reference: Clauses
5.1. EPL Introduction
5.2. EPL Syntax
5.2.1. Specifying Time Periods
5.2.2. Using Comments
5.2.3. Reserved Keywords
5.2.4. Escaping Strings
5.2.5. Data Types
5.2.6. Using Constants and Enum Types
5.2.7. Annotation
5.2.8. Expression Alias
5.2.9. Expression Declaration
5.2.10. Script Declaration
5.2.11. Referring to a Context
5.3. Choosing Event Properties And Events: the Select Clause
5.3.1. Choosing all event properties: select *
5.3.2. Choosing specific event properties
5.3.3. Expressions
5.3.4. Renaming event properties
5.3.5. Choosing event properties and events in a join
5.3.6. Choosing event properties and events from a pattern
5.3.7. Selecting insert and remove stream events
5.3.8. Qualifying property names and stream names
5.3.9. Select Distinct
5.3.10. Transposing an Expression Result to a Stream
5.3.11. Selecting EventBean instead of Underlying Event
5.4. Specifying Event Streams: the From Clause
5.4.1. Filter-based Event Streams
5.4.2. Pattern-based Event Streams
5.4.3. Specifying Views
5.4.4. Multiple Data Window Views
5.4.5. Using the Stream Name
5.5. Specifying Search Conditions: the Where Clause
5.6. Aggregates and grouping: the Group-by Clause and the Having Clause
5.6.1. Using aggregate functions
5.6.2. Organizing statement results into groups: the Group-by clause
5.6.3. Using Group-By with Rollup, Cube and Grouping Sets
5.6.4. Specifying grouping for each aggregation function
5.6.5. Selecting groups of events: the Having clause
5.6.6. How the stream filter, Where, Group By and Having clauses interact
5.6.7. Comparing Keyed Segmented Context, the Group By clause and the std:groupwin view
5.7. Stabilizing and Controlling Output: the Output Clause
5.7.1. Output Clause Options
5.7.2. Aggregation, Group By, Having and Output clause interaction
5.7.3. Runtime Considerations
5.8. Sorting Output: the Order By Clause
5.9. Limiting Row Count: the Limit Clause
5.10. Merging Streams and Continuous Insertion: the Insert Into Clause
5.10.1. Transposing a Property To a Stream
5.10.2. Merging Streams By Event Type
5.10.3. Merging Disparate Types of Events: Variant Streams
5.10.4. Decorated Events
5.10.5. Event as a Property
5.10.6. Instantiating and Populating an Underlying Event Object
5.10.7. Transposing an Expression Result
5.10.8. Select-Clause Expression And Inserted-Into Column Event Type
5.11. Subqueries
5.11.1. The 'exists' Keyword
5.11.2. The 'in' and 'not in' Keywords
5.11.3. The 'any' and 'some' Keywords
5.11.4. The 'all' Keyword
5.11.5. Subquery With Group By Clause
5.11.6. Multi-Column Selection
5.11.7. Multi-Row Selection
5.11.8. Hints Related to Subqueries
5.12. Joining Event Streams
5.12.1. Introducing Joins
5.12.2. Inner (Default) Joins
5.12.3. Outer, Left and Right Joins
5.12.4. Unidirectional Joins
5.12.5. Hints Related to Joins
5.13. Accessing Relational Data via SQL
5.13.1. Joining SQL Query Results
5.13.2. SQL Query and the EPL Where Clause
5.13.3. Outer Joins With SQL Queries
5.13.4. Using Patterns to Request (Poll) Data
5.13.5. Polling SQL Queries via Iterator
5.13.6. JDBC Implementation Overview
5.13.7. Oracle Drivers and No-Metadata Workaround
5.13.8. SQL Input Parameter and Column Output Conversion
5.13.9. SQL Row POJO Conversion
5.14. Accessing Non-Relational Data via Method Invocation
5.14.1. Joining Method Invocation Results
5.14.2. Polling Method Invocation Results via Iterator
5.14.3. Providing the Method
5.14.4. Using a Map Return Type
5.14.5. Using an Object Array Return Type
5.15. Declaring an Event Type: Create Schema
5.15.1. Declare an Event Type by Providing Names and Types
5.15.2. Declare an Event Type by Providing a Class Name
5.15.3. Declare a Variant Stream
5.16. Splitting and Duplicating Streams
5.17. Variables and Constants
5.17.1. Creating Variables: the Create Variable clause
5.17.2. Setting Variable Values: the On Set clause
5.17.3. Using Variables
5.17.4. Object-Type Variables
5.17.5. Class and Event-Type Variables
5.18. Declaring Global Expressions, Aliases And Scripts: Create Expression
5.18.1. Global Expression Aliases
5.18.2. Global Expression Declarations
5.18.3. Global Scripts
5.19. Contained-Event Selection
5.19.1. Select-Clause in a Contained-Event Selection
5.19.2. Where Clause in a Contained-Event Selection
5.19.3. Contained-Event Selection and Joins
5.19.4. Sentence and Word Example
5.19.5. More Examples
5.19.6. Arrays returned by a Contained Expression
5.19.7. Contained-Event Limitations
5.20. Updating an Insert Stream: the Update IStream Clause
5.20.1. Immutability and Updates
5.21. Controlling Event Delivery : The For Clause
6. EPL Reference: Named Windows And Tables
6.1. Overview
6.1.1. Named Window Overview
6.1.2. Table Overview
6.1.3. Comparing Named Windows And Tables
6.2. Named Window Usage
6.2.1. Creating Named Windows: the Create Window clause
6.2.2. Inserting Into Named Windows
6.2.3. Selecting From Named Windows
6.3. Table Usage
6.3.1. Creating Tables: The Create Table clause
6.3.2. Aggregating Into Table Rows: The Into Table clause
6.3.3. Table Column Keyed-Access Expressions
6.3.4. Inserting Into Tables
6.3.5. Selecting From Tables
6.4. Triggered Select: the On Select clause
6.4.1. Notes on On-Select With Named Windows
6.4.2. Notes on On-Select With Tables
6.4.3. On-Select Compared To Join
6.5. Triggered Select+Delete: the On Select Delete clause
6.6. Updating Data: the On Update clause
6.6.1. Notes on On-Update With Named Windows
6.6.2. Notes on On-Update With Tables
6.7. Deleting Data: the On Delete clause
6.7.1. Using Patterns in the On Delete Clause
6.7.2. Notes on On-Delete With Named Windows
6.7.3. Notes on On-Delete With Tables
6.8. Triggered Upsert using the On-Merge Clause
6.8.1. Notes on On-Merge With Named Windows
6.8.2. Notes on On-Merge With Tables
6.9. Explicitly Indexing Named Windows and Tables
6.10. Using Fire-And-Forget Queries with Named Windows and Tables
6.10.1. Inserting Data
6.10.2. Updating Data
6.10.3. Deleting Data
6.11. Versioning and Revision Event Type Use with Named Windows
6.12. Events As Property
7. EPL Reference: Patterns
7.1. Event Pattern Overview
7.2. How to use Patterns
7.2.1. Pattern Syntax
7.2.2. Patterns in EPL
7.2.3. Subscribing to Pattern Events
7.2.4. Pulling Data from Patterns
7.2.5. Pattern Error Reporting
7.2.6. Suppressing Same-Event Matches
7.2.7. Discarding Partially Completed Patterns
7.3. Operator Precedence
7.4. Filter Expressions In Patterns
7.4.1. Controlling Event Consumption
7.4.2. Use With Named Windows and Tables
7.5. Pattern Operators
7.5.1. Every
7.5.2. Every-Distinct
7.5.3. Repeat
7.5.4. Repeat-Until
7.5.5. And
7.5.6. Or
7.5.7. Not
7.5.8. Followed-by
7.5.9. Pattern Guards
7.6. Pattern Atoms
7.6.1. Filter Atoms
7.6.2. Observer Atoms Overview
7.6.3. Interval (timer:interval)
7.6.4. Crontab (timer:at)
7.6.5. Schedule (timer:schedule)
8. EPL Reference: Match Recognize
8.1. Overview
8.2. Comparison of Match Recognize and EPL Patterns
8.3. Syntax
8.3.1. Syntax Example
8.4. Pattern and Pattern Operators
8.4.1. Operator Precedence
8.4.2. Concatenation
8.4.3. Alternation
8.4.4. Quantifiers Overview
8.4.5. Permutations
8.4.6. Variables Can be Singleton or Group
8.4.7. Eliminating Duplicate Matches
8.4.8. Greedy Or Reluctant
8.4.9. Quantifier - One Or More (+ and +?)
8.4.10. Quantifier - Zero Or More (* and *?)
8.4.11. Quantifier - Zero Or One (? and ??)
8.4.12. Repetition - Exactly N Matches
8.4.13. Repetition - N Or More Matches
8.4.14. Repetition - Between N and M Matches
8.4.15. Repetition - Between Zero and M Matches
8.4.16. Repetition Equivalence
8.5. Define Clause
8.5.1. The Prev Operator
8.6. Measure Clause
8.7. Datawindow-Bound
8.8. Interval
8.9. Interval-Or-Terminated
8.10. Use with Different Event Types
8.11. Limitations
9. EPL Reference: Operators
9.1. Arithmetic Operators
9.2. Logical And Comparison Operators
9.2.1. Null-Value Comparison Operators
9.3. Concatenation Operators
9.4. Binary Operators
9.5. Array Definition Operator
9.6. Dot Operator
9.6.1. Duck Typing
9.7. The 'in' Keyword
9.7.1. 'in' for Range Selection
9.8. The 'between' Keyword
9.9. The 'like' Keyword
9.10. The 'regexp' Keyword
9.11. The 'any' and 'some' Keywords
9.12. The 'all' Keyword
9.13. The 'new' Keyword
9.13.1. Using 'new' To Populate A Data Structure
9.13.2. Using 'new' To Instantiate An Object
10. EPL Reference: Functions
10.1. Single-row Function Reference
10.1.1. The Case Control Flow Function
10.1.2. The Cast Function
10.1.3. The Coalesce Function
10.1.4. The Current_Evaluation_Context Function
10.1.5. The Current_Timestamp Function
10.1.6. The Exists Function
10.1.7. The Grouping Function
10.1.8. The Grouping_Id Function
10.1.9. The Instance-Of Function
10.1.10. The Istream Function
10.1.11. The Min and Max Functions
10.1.12. The Previous Function
10.1.13. The Previous-Tail Function
10.1.14. The Previous-Window Function
10.1.15. The Previous-Count Function
10.1.16. The Prior Function
10.1.17. The Type-Of Function
10.2. Aggregation Functions
10.2.1. SQL-Standard Functions
10.2.2. Event Aggregation Functions
10.2.3. Approximation Aggregation Functions
10.2.4. Additional Aggregation Functions
10.3. User-Defined Functions
10.4. Select-Clause transpose Function
10.4.1. Transpose with Insert-Into
11. EPL Reference: Enumeration Methods
11.1. Overview
11.2. Example Events
11.3. How to Use
11.3.1. Syntax
11.3.2. Introductory Examples
11.3.3. Input, Output and Limitations
11.4. Inputs
11.4.1. Subquery Results
11.4.2. Named Window
11.4.3. Table
11.4.4. Event Property
11.4.5. Event Aggregation Function
11.4.6. prev, prevwindow and prevtail Single-Row Functions as Input
11.4.7. Single-Row Function, User-Defined Function and Enum Types
11.4.8. Declared Expression
11.4.9. Variables
11.4.10. Substitution Parameters
11.4.11. Match-Recognize Group Variable
11.4.12. Pattern Repeat and Repeat-Until Operators
11.5. Example
11.6. Reference
11.6.1. Aggregate
11.6.2. AllOf
11.6.3. AnyOf
11.6.4. Average
11.6.5. CountOf
11.6.6. DistinctOf
11.6.7. Except
11.6.8. FirstOf
11.6.9. GroupBy
11.6.10. Intersect
11.6.11. LastOf
11.6.12. LeastFrequent
11.6.13. Max
11.6.14. MaxBy
11.6.15. Min
11.6.16. MinBy
11.6.17. MostFrequent
11.6.18. OrderBy and OrderByDesc
11.6.19. Reverse
11.6.20. SelectFrom
11.6.21. SequenceEqual
11.6.22. SumOf
11.6.23. Take
11.6.24. TakeLast
11.6.25. TakeWhile
11.6.26. TakeWhileLast
11.6.27. ToMap
11.6.28. Union
11.6.29. Where
12. EPL Reference: Date-Time Methods
12.1. Overview
12.2. How to Use
12.2.1. Syntax
12.3. Calendar and Formatting Reference
12.3.1. Between
12.3.2. Format
12.3.3. Get (By Field)
12.3.4. Get (By Name)
12.3.5. Minus
12.3.6. Plus
12.3.7. RoundCeiling
12.3.8. RoundFloor
12.3.9. RoundHalf
12.3.10. Set (By Field)
12.3.11. WithDate
12.3.12. WithMax
12.3.13. WithMin
12.3.14. WithTime
12.3.15. ToCalendar
12.3.16. ToDate
12.3.17. ToMillisec
12.4. Interval Algebra Reference
12.4.1. Examples
12.4.2. Interval Algebra Parameters
12.4.3. Performance
12.4.4. Limitations
12.4.5. After
12.4.6. Before
12.4.7. Coincides
12.4.8. During
12.4.9. Finishes
12.4.10. Finished By
12.4.11. Includes
12.4.12. Meets
12.4.13. Met By
12.4.14. Overlaps
12.4.15. Overlapped By
12.4.16. Starts
12.4.17. Started By
13. EPL Reference: Views
13.1. A Note on View Parameters
13.2. Data Window Views
13.2.1. Length window (win:length)
13.2.2. Length batch window (win:length_batch)
13.2.3. Time window (win:time)
13.2.4. Externally-timed window (win:ext_timed)
13.2.5. Time batch window (win:time_batch)
13.2.6. Externally-timed batch window (win:ext_timed_batch)
13.2.7. Time-Length combination batch window (win:time_length_batch)
13.2.8. Time-Accumulating window (win:time_accum)
13.2.9. Keep-All window (win:keepall)
13.2.10. First Length (win:firstlength)
13.2.11. First Time (win:firsttime)
13.2.12. Expiry Expression (win:expr)
13.2.13. Expiry Expression Batch (win:expr_batch)
13.3. Standard view set
13.3.1. Unique (std:unique)
13.3.2. Grouped Data Window (std:groupwin)
13.3.3. Size (std:size)
13.3.4. Last Event (std:lastevent)
13.3.5. First Event (std:firstevent)
13.3.6. First Unique (std:firstunique)
13.4. Statistics views
13.4.1. Univariate statistics (stat:uni)
13.4.2. Regression (stat:linest)
13.4.3. Correlation (stat:correl)
13.4.4. Weighted average (stat:weighted_avg)
13.5. Extension View Set
13.5.1. Sorted Window View (ext:sort)
13.5.2. Ranked Window View (ext:rank)
13.5.3. Time-Order View (ext:time_order)
14. EPL Reference: Data Flow
14.1. Introduction
14.2. Usage
14.2.1. Overview
14.2.2. Syntax
14.3. Built-in Operators
14.3.1. BeaconSource
14.3.2. EPStatementSource
14.3.3. EventBusSink
14.3.4. EventBusSource
14.3.5. Filter
14.3.6. LogSink
14.3.7. Select
14.4. API
14.4.1. Declaring a Data Flow
14.4.2. Instantiating a Data Flow
14.4.3. Executing a Data Flow
14.4.4. Instantiation Options
14.4.5. Start Captive
14.4.6. Data Flow Punctuation with Markers
14.4.7. Exception Handling
14.5. Examples
14.6. Operator Implementation
14.6.1. Sample Operator Acting as Source
14.6.2. Sample Tokenizer Operator
14.6.3. Sample Aggregator Operator
15. API Reference
15.1. API Overview
15.2. The Service Provider Interface
15.3. The Administrative Interface
15.3.1. Creating Statements
15.3.2. Receiving Statement Results
15.3.3. Setting a Subscriber Object
15.3.4. Adding Listeners
15.3.5. Using Iterators
15.3.6. Managing Statements
15.3.7. Atomic Statement Management
15.3.8. Runtime Configuration
15.4. The Runtime Interface
15.4.1. Event Sender
15.4.2. Receiving Unmatched Events
15.5. On-Demand Fire-And-Forget Query Execution
15.5.1. On-Demand Query Single Execution
15.5.2. On-Demand Query Prepared Unparameterized Execution
15.5.3. On-Demand Query Prepared Parameterized Execution
15.6. Event and Event Type
15.6.1. Event Type Metadata
15.6.2. Event Object
15.6.3. Query Example
15.6.4. Pattern Example
15.7. Engine Threading and Concurrency
15.7.1. Advanced Threading
15.7.2. Processing Order
15.8. Controlling Time-Keeping
15.8.1. Controlling Time Using Time Span Events
15.8.2. Additional Time-Related APIs
15.9. Time Resolution
15.10. Service Isolation
15.10.1. Overview
15.10.2. Example: Suspending a Statement
15.10.3. Example: Catching up a Statement from Historical Data
15.10.4. Isolation for Insert-Into
15.10.5. Isolation for Named Windows and Tables
15.10.6. Runtime Considerations
15.11. Exception Handling
15.12. Condition Handling
15.13. Statement Object Model
15.13.1. Building an Object Model
15.13.2. Building Expressions
15.13.3. Building a Pattern Statement
15.13.4. Building a Select Statement
15.13.5. Building a Create-Variable and On-Set Statement
15.13.6. Building Create-Window, On-Delete and On-Select Statements
15.14. Prepared Statement and Substitution Parameters
15.15. Engine and Statement Metrics Reporting
15.15.1. Engine Metrics
15.15.2. Statement Metrics
15.16. Event Rendering to XML and JSON
15.16.1. JSON Event Rendering Conventions and Options
15.16.2. XML Event Rendering Conventions and Options
15.17. Plug-in Loader
15.18. Interrogating EPL Annotations
15.19. Context Partition Selection
15.19.1. Selectors
15.20. Context Partition Administration
15.21. Test and Assertion Support
15.21.1. EPAssertionUtil Summary
15.21.2. SupportUpdateListener Summary
15.21.3. Usage Example
16. Configuration
16.1. Programmatic Configuration
16.2. Configuration via XML File
16.3. XML Configuration File
16.4. Configuration Items
16.4.1. Events represented by Java Classes
16.4.2. Events represented by java.util.Map
16.4.3. Events represented by Object[] (Object-array)
16.4.4. Events represented by org.w3c.dom.Node
16.4.5. Events represented by Plug-in Event Representations
16.4.6. Class and package imports
16.4.7. Cache Settings for From-Clause Method Invocations
16.4.8. Variables
16.4.9. Relational Database Access
16.4.10. Engine Settings related to Concurrency and Threading
16.4.11. Engine Settings related to Event Metadata
16.4.12. Engine Settings related to View Resources
16.4.13. Engine Settings related to Logging
16.4.14. Engine Settings related to Variables
16.4.15. Engine Settings related to Patterns
16.4.16. Engine Settings related to Scripts
16.4.17. Engine Settings related to Stream Selection
16.4.18. Engine Settings related to Time Source
16.4.19. Engine Settings related to JMX Metrics
16.4.20. Engine Settings related to Metrics Reporting
16.4.21. Engine Settings related to Language and Locale
16.4.22. Engine Settings related to Expression Evaluation
16.4.23. Engine Settings related to Execution of Statements
16.4.24. Engine Settings related to Exception Handling
16.4.25. Engine Settings related to Condition Handling
16.4.26. Revision Event Type
16.4.27. Variant Stream
16.5. Type Names
16.6. Runtime Configuration
16.7. Logging Configuration
16.7.1. Log4j Logging Configuration
17. Development Lifecycle
17.1. Authoring
17.2. Testing
17.3. Debugging
17.3.1. @Audit Annotation
17.4. Packaging and Deploying Overview
17.5. EPL Modules
17.6. The Deployment Administrative Interface
17.6.1. Reading Module Content
17.6.2. Ordering Multiple Modules
17.6.3. Deploying and Undeploying
17.6.4. Listing Deployments
17.6.5. State Transitioning a Module
17.6.6. Best Practices
17.7. J2EE Packaging and Deployment
17.7.1. J2EE Deployment Considerations
17.7.2. Servlet Context Listener
17.8. Monitoring and JMX
18. Integration and Extension
18.1. Overview
18.2. Virtual Data Window
18.2.1. How to Use
18.2.2. Implementing the Factory
18.2.3. Implementing the Virtual Data Window
18.2.4. Implementing the Lookup
18.3. Single-Row Function
18.3.1. Implementing a Single-Row Function
18.3.2. Configuring the Single-Row Function Name
18.3.3. Value Cache
18.3.4. Single-Row Functions in Filter Predicate Expressions
18.3.5. Single-Row Functions Taking Events as Parameters
18.3.6. Receiving a Context Object
18.3.7. Exception Handling
18.4. Derived-value and Data Window View
18.4.1. Implementing a View Factory
18.4.2. Implementing a View
18.4.3. View Contract
18.4.4. Configuring View Namespace and Name
18.4.5. Requirement for Data Window Views
18.4.6. Requirement for Derived-Value Views
18.4.7. Requirement for Grouped Views
18.5. Aggregation Function
18.5.1. Aggregation Single-Function Development
18.5.2. Aggregation Multi-Function Development
18.6. Pattern Guard
18.6.1. Implementing a Guard Factory
18.6.2. Implementing a Guard Class
18.6.3. Configuring Guard Namespace and Name
18.7. Pattern Observer
18.7.1. Implementing an Observer Factory
18.7.2. Implementing an Observer Class
18.7.3. Configuring Observer Namespace and Name
18.8. Event Type And Event Object
18.8.1. How It Works
18.8.2. Steps
18.8.3. URI-based Resolution
18.8.4. Example
19. Script Support
19.1. Overview
19.2. Syntax
19.3. Examples
19.4. Built-In EPL Script Attributes
19.5. Performance Notes
19.6. Additional Notes
20. Examples, Tutorials, Case Studies
20.1. Examples Overview
20.2. Running the Examples
20.3. AutoID RFID Reader
20.4. Runtime Configuration
20.5. JMS Server Shell and Client
20.5.1. Overview
20.5.2. JMS Messages as Events
20.5.3. JMX for Remote Dynamic Statement Management
20.6. Market Data Feed Monitor
20.6.1. Input Events
20.6.2. Computing Rates Per Feed
20.6.3. Detecting a Fall-off
20.6.4. Event generator
20.7. OHLC Plug-in View
20.8. Transaction 3-Event Challenge
20.8.1. The Events
20.8.2. Combined event
20.8.3. Real time summary data
20.8.4. Find problems
20.8.5. Event generator
20.9. Self-Service Terminal
20.9.1. Events
20.9.2. Detecting Customer Check-in Issues
20.9.3. Absence of Status Events
20.9.4. Activity Summary Data
20.9.5. Sample Application for J2EE Application Server
20.10. Assets Moving Across Zones - An RFID Example
20.11. StockTicker
20.12. MatchMaker
20.13. Named Window Query
20.14. Sample Virtual Data Window
20.15. Sample Cycle Detection
20.16. Quality of Service
20.17. Trivia Geeks Club
21. Performance
21.1. Performance Results
21.2. Performance Tips
21.2.1. Understand how to tune your Java virtual machine
21.2.2. Input and Output Bottlenecks
21.2.3. Threading
21.2.4. Select the underlying event rather than individual fields
21.2.5. Prefer stream-level filtering over where-clause filtering
21.2.6. Reduce the use of arithmetic in expressions
21.2.7. Remove Unnecessary Constructs
21.2.8. End Pattern Sub-Expressions
21.2.9. Consider using EventPropertyGetter for fast access to event properties
21.2.10. Consider casting the underlying event
21.2.11. Turn off logging and audit
21.2.12. Disable view sharing
21.2.13. Tune or disable delivery order guarantees
21.2.14. Use a Subscriber Object to Receive Events
21.2.15. Consider Data Flows
21.2.16. High-Arrival-Rate Streams and Single Statements
21.2.17. Subqueries versus Joins And Where-clause And Data Windows
21.2.18. Patterns and Pattern Sub-Expression Instances
21.2.19. Pattern Sub-Expression Instance Versus Data Window Use
21.2.20. The Keep-All Data Window
21.2.21. Statement Design for Reduced Memory Consumption - Diagnosing OutOfMemoryError
21.2.22. Performance, JVM, OS and hardware
21.2.23. Consider using Hints
21.2.24. Optimizing Stream Filter Expressions
21.2.25. Statement and Engine Metric Reporting
21.2.26. Expression Evaluation Order and Early Exit
21.2.27. Large Number of Threads
21.2.28. Filter Evaluation Tuning
21.2.29. Context Partition Related Information
21.2.30. Prefer Constant Variables over Non-Constant Variables
21.2.31. Prefer Object-array Events
21.2.32. Composite or Compound Keys
21.2.33. Notes on Query Planning
21.2.34. Query Planning Expression Analysis Hints
21.2.35. Query Planning Index Hints
21.2.36. Measuring Throughput
21.2.37. Do not create the same EPL Statement X times
21.2.38. Comparing Single-Threaded and Multi-Threaded Performance
21.2.39. Incremental Versus Recomputed Aggregation for Named Window Events
21.2.40. When Does Memory Get Released
21.2.41. Measure throughput of non-matches as well as matches
21.3. Using the performance kit
21.3.1. How to use the performance kit
21.3.2. How we use the performance kit
22. References
22.1. Reference List
A. Output Reference and Samples
A.1. Introduction and Sample Data
A.2. Output for Un-aggregated and Un-grouped Queries
A.2.1. No Output Rate Limiting
A.2.2. Output Rate Limiting - Default
A.2.3. Output Rate Limiting - Last
A.2.4. Output Rate Limiting - First
A.2.5. Output Rate Limiting - Snapshot
A.3. Output for Fully-aggregated and Un-grouped Queries
A.3.1. No Output Rate Limiting
A.3.2. Output Rate Limiting - Default
A.3.3. Output Rate Limiting - Last
A.3.4. Output Rate Limiting - First
A.3.5. Output Rate Limiting - Snapshot
A.4. Output for Aggregated and Un-grouped Queries
A.4.1. No Output Rate Limiting
A.4.2. Output Rate Limiting - Default
A.4.3. Output Rate Limiting - Last
A.4.4. Output Rate Limiting - First
A.4.5. Output Rate Limiting - Snapshot
A.5. Output for Fully-aggregated and Grouped Queries
A.5.1. No Output Rate Limiting
A.5.2. Output Rate Limiting - Default
A.5.3. Output Rate Limiting - All
A.5.4. Output Rate Limiting - Last
A.5.5. Output Rate Limiting - First
A.5.6. Output Rate Limiting - Snapshot
A.6. Output for Aggregated and Grouped Queries
A.6.1. No Output Rate Limiting
A.6.2. Output Rate Limiting - Default
A.6.3. Output Rate Limiting - All
A.6.4. Output Rate Limiting - Last
A.6.5. Output Rate Limiting - First
A.6.6. Output Rate Limiting - Snapshot
A.7. Output for Fully-Aggregated, Grouped Queries With Rollup
A.7.1. No Output Rate Limiting
A.7.2. Output Rate Limiting - Default
A.7.3. Output Rate Limiting - All
A.7.4. Output Rate Limiting - Last
A.7.5. Output Rate Limiting - First
A.7.6. Output Rate Limiting - Snapshot
B. Reserved Keywords
Index

Analyzing and reacting to information in real time often requires the development of custom applications. Typically these applications must obtain the data to analyze, filter the data, derive information and then indicate this information through some form of presentation or communication. Data may arrive at high frequency, requiring high-throughput processing. Applications may also need to be flexible and react to changes in requirements while the data is processed. Esper is an event stream processor that aims to enable a short development cycle from inception to production for these types of applications.

This document is a resource for software developers who develop event-driven applications. It also contains information that is useful for business analysts and system architects who are evaluating Esper.

It is assumed that the reader is familiar with the Java programming language.

This document is relevant in all phases of your software development project: from design to deployment and support.

If you are new to Esper, please follow these steps:

  1. Read the tutorials, case studies and solution patterns available on the Esper public web site at http://esper.codehaus.org

  2. Read Section 1.1, “Introduction to CEP and event series analysis” if you are new to CEP and ESP (complex event processing, event stream processing)

  3. Read Chapter 2, Event Representations, which explains the different ways of representing events to Esper

  4. Read Chapter 3, Processing Model to gain insight into EPL continuous query results

  5. Read Section 5.1, “EPL Introduction” for an introduction to event stream processing via EPL

  6. Read Section 7.1, “Event Pattern Overview” for an overview of event patterns

  7. Read Section 8.1, “Overview” for an overview of event patterns using the match recognize syntax

  8. Then glance over the examples in Section 20.1, “Examples Overview”

  9. Finally, to test drive Esper performance, read Chapter 21, Performance

This section outlines the different means to model and represent events.

Esper uses the term event type to describe the type information available for an event representation.

Your application may configure predefined event types at startup time or dynamically add event types at runtime via API or EPL syntax. See Section 16.4, “Configuration Items” for startup-time configuration and Section 15.3.8, “Runtime Configuration” for the runtime configuration API.

The EPL create schema syntax allows declaring an event type at runtime using EPL; see Section 5.15, “Declaring an Event Type: Create Schema”.

In Section 15.6, “Event and Event Type” we explain how an event type becomes visible in EPL statements and output events delivered by the engine.

An event is an immutable record of a past occurrence of an action or state change. Event properties capture the state information for an event.

In Esper, an event can be represented by any of the following underlying Java objects:

  • java.lang.Object: any Java POJO (plain-old Java object); see Section 2.5, “Plain-Old Java Object Events”

  • java.util.Map: map events; see Section 2.6, “java.util.Map Events”

  • Object[] (object-array): see Section 2.7, “Object-array (Object[]) Events”

  • org.w3c.dom.Node: XML events; see Section 2.8, “org.w3c.dom.Node XML Events”

  • Additional event representations; see Section 2.9, “Additional Event Representations”


Esper provides multiple choices for representing an event. There is no absolute need for you to create new Java classes to represent an event.
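
As a rough illustration of that choice, the sketch below holds the same event data three ways: as a POJO, as a java.util.Map and as an Object[]. The StockTick class and the property names symbol and price are hypothetical, and no Esper API is involved; how event types are made known to the engine is covered in the configuration sections referenced above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: StockTick, "symbol" and "price" are hypothetical names,
// and nothing here calls the Esper API.
final class EventRepresentationSketch {

    // 1. Plain-old Java object: properties are exposed through getters.
    static final class StockTick {
        private final String symbol;
        private final double price;

        StockTick(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }

        public String getSymbol() { return symbol; }
        public double getPrice() { return price; }
    }

    // 2. java.util.Map: each map key is a property name.
    static Map<String, Object> asMap(StockTick tick) {
        Map<String, Object> event = new HashMap<>();
        event.put("symbol", tick.getSymbol());
        event.put("price", tick.getPrice());
        return event;
    }

    // 3. Object[]: values are positional; the property name for each
    //    position would be declared separately with the event type.
    static Object[] asObjectArray(StockTick tick) {
        return new Object[] { tick.getSymbol(), tick.getPrice() };
    }

    public static void main(String[] args) {
        StockTick tick = new StockTick("IBM", 100.0);
        System.out.println(asMap(tick).get("price"));
        System.out.println(asObjectArray(tick)[0]);
    }
}
```

The Map and Object[] forms need no dedicated class per event type, which is the point the paragraph above makes.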

Event representations have the following in common:

  • All event representations support nested, indexed and mapped properties (also known as property expressions), as explained in more detail below. There is no limitation to the nesting level.

  • All event representations provide event type metadata. This includes type metadata for nested properties.

  • All event representations allow transposing the event itself and parts or all of its property graph into new events. The term transposing refers to selecting the event itself or event properties that are themselves nestable property graphs, and then querying the event's properties or nested property graphs in further statements. The Apache Axiom event representation is an exception and does not currently allow transposing event properties but does allow transposing the event itself.

  • The Java object, Map and Object-array representations allow supertypes.

The API behavior for all event representations is the same, with minor exceptions noted in this chapter.

The benefits of multiple event representations are:

  • For applications that already have events in one of the supported representations, there is no need to transform events into a Java object before processing.

  • Event representations are exchangeable, reducing or eliminating the need to change statements when the event representation changes.

  • Event representations are interoperable, allowing all event representations to interoperate in the same or different statements.

  • The choice makes it possible to consciously trade off performance, ease of use, the ability to evolve, and the effort needed to import or externalize events and use existing event type metadata.

Event properties capture the state information for an event. Event properties can be simple as well as indexed, mapped and nested event properties. The table below outlines the different types of properties and their syntax in an event expression. This syntax allows statements to query deep JavaBean object graphs, XML structures and Map events.

Type      Syntax            Example
Simple    name              street
Indexed   name[index]       street[0]
Mapped    name('key')       address('home')
Nested    name.nestedname   person.address


Combinations are also possible. For example, a valid combination could be person.address('home').street[0].
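
To make the combined expression concrete, here is a minimal sketch of Java classes on which person.address('home').street[0] could be read. All class names are hypothetical, and the sketch assumes JavaBean-style accessors in which a mapped property is a getter taking a String key and an indexed property is a getter taking an int index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical classes for illustration; none of them ship with Esper.
final class NestedPropertySketch {

    static final class Address {
        private final String[] street;

        Address(String... street) { this.street = street; }

        // Indexed property: street[0] selects element 0.
        public String getStreet(int index) { return street[index]; }
    }

    static final class Person {
        private final Map<String, Address> addresses = new HashMap<>();

        void putAddress(String key, Address address) { addresses.put(key, address); }

        // Mapped property: address('home') selects the entry keyed "home".
        public Address getAddress(String key) { return addresses.get(key); }
    }

    static final class CustomerEvent {
        private final Person person;

        CustomerEvent(Person person) { this.person = person; }

        // Nested property: person.<...> navigates into the returned object.
        public Person getPerson() { return person; }
    }

    // Plain-Java reading of person.address('home').street[0].
    static String firstHomeStreetLine(CustomerEvent event) {
        return event.getPerson().getAddress("home").getStreet(0);
    }

    public static void main(String[] args) {
        Person person = new Person();
        person.putAddress("home", new Address("1 Main Street", "Apt 2"));
        System.out.println(firstHomeStreetLine(new CustomerEvent(person)));
    }
}
```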

You may use any expression as a mapped property key or indexed property index by putting the expression within parentheses after the mapped or indexed property name. Please find examples below.

The key or index expression must be placed in parentheses. When using an expression as key for a mapped property, the expression must return a String-typed value. When using an expression as index for an indexed property, the expression must return an int-typed value.

The example below uses Java classes to illustrate; the same principles apply to all event representations.

Assume a class declares these properties (getters not shown for brevity):

import java.util.Map;

public class MyEventType {
  String myMapKey;
  int myIndexValue;
  int myInnerIndexValue;
  Map<String, InnerType> innerTypesMap;  // mapped property
  InnerType[] innerTypesArray;           // indexed property
}

public class InnerType {
  String name;
  int[] ids;
}

A sample EPL statement demonstrating expressions as map keys or indexes is:

select innerTypesMap('somekey'),  // returns map value for 'somekey'
  innerTypesMap(myMapKey),        // returns map value for myMapKey value (an expression)
  innerTypesArray[1],             // returns array value at index 1
  innerTypesArray(myIndexValue)   // returns array value at index myIndexValue (an expression)
  from MyEventType

The dot-operator can be used to access methods on the value objects returned by the mapped or indexed properties. When using the dot-operator, the syntax follows the chained method invocation described in Section 9.6, “Dot Operator”.

A sample EPL statement demonstrating the dot-operator as well as expressions as map keys or indexes is:

 select innerTypesMap('somekey').ids[1],
  innerTypesMap(myMapKey).getIds(myIndexValue),
  innerTypesArray[1].ids[2],
  innerTypesArray(myIndexValue).getIds(myInnerIndexValue)
  from MyEventType

Please note the following limitations:

  • The square-bracket syntax for indexed properties does not allow expressions and requires a constant index value.

  • When using the dot-operator with mapped or indexed properties that have expressions as map keys or indexes you must follow the chained method invocation syntax.
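For readers more familiar with Java than EPL, the following sketch spells out what the expression-keyed and expression-indexed lookups above correspond to in plain Java. It is an illustration only and does not show how the engine evaluates properties. The class KeyIndexSketch and its method names are made up for this example, and the event classes are repeated with package-private fields so that the block is self-contained.

```java
import java.util.HashMap;
import java.util.Map;

class InnerType {
    final String name;
    final int[] ids;

    InnerType(String name, int[] ids) {
        this.name = name;
        this.ids = ids;
    }

    int[] getIds() { return ids; }
    int getIds(int index) { return ids[index]; }   // indexed getter used by chained invocation
}

class MyEventType {
    String myMapKey;
    int myIndexValue;
    int myInnerIndexValue;
    Map<String, InnerType> innerTypesMap = new HashMap<>();  // mapped property
    InnerType[] innerTypesArray;                              // indexed property
}

// Hypothetical helper: each method is the plain-Java reading of one EPL expression.
class KeyIndexSketch {
    // innerTypesMap(myMapKey): the key comes from another property of the same event
    static InnerType mapByExpression(MyEventType e) {
        return e.innerTypesMap.get(e.myMapKey);
    }

    // innerTypesArray(myIndexValue): the index comes from another property of the same event
    static InnerType arrayByExpression(MyEventType e) {
        return e.innerTypesArray[e.myIndexValue];
    }

    // innerTypesArray(myIndexValue).getIds(myInnerIndexValue): chained method invocation
    static int chained(MyEventType e) {
        return e.innerTypesArray[e.myIndexValue].getIds(e.myInnerIndexValue);
    }
}
```

Note how the chained form calls getIds(int) as a method, which is why the chained method invocation syntax is required whenever the key or index is an expression.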

Dynamic (unchecked) properties are event properties that need not be known at statement compilation time. Such properties are resolved during runtime: they provide duck typing functionality.

The idea behind dynamic properties is that for a given underlying event representation we don't always know all properties in advance. An underlying event may have additional properties that are not known at statement compilation time, that we want to query on. The concept is especially useful for events that represent rich, object-oriented domain models.

The syntax of dynamic properties consists of the property name and a question mark. Indexed, mapped and nested properties can also be dynamic properties:


Dynamic properties always return the java.lang.Object type. Also, dynamic properties return a null value if the dynamic property does not exist on events processed at runtime.

As an example, consider an OrderEvent event that provides an "item" property. The "item" property is of type Object and holds a reference to an instance of either a Service or Product.

Assume that both Service and Product classes provide a property named "price". Via a dynamic property we can specify a query that obtains the price property from either object (Service or Product):

select item.price? from OrderEvent

As a second example, assume that the Service class contains a "serviceName" property that the Product class does not possess. The following query returns the value of the "serviceName" property for Service objects. It returns a null-value for Product objects that do not have the "serviceName" property:

select item.serviceName? from OrderEvent

Consider the case where OrderEvent has multiple implementation classes, some of which have a "timestamp" property. The next query returns the timestamp property of those implementations of the OrderEvent interface that feature the property:

select timestamp? from OrderEvent

The query as above returns a single column named "timestamp?" of type Object.
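The "return the value if the property exists, otherwise null" behavior of dynamic properties can be pictured with plain Java reflection. The sketch below is an analogy and not the engine's implementation; the class DynamicPropertySketch, the method getDynamic and the sample Service and Product classes are assumptions made for illustration.

```java
import java.lang.reflect.Method;

class DynamicPropertySketch {
    // Looks up a JavaBean-style getter by name at runtime. Returns the property
    // value typed as Object, or null when the target has no such getter.
    static Object getDynamic(Object target, String propertyName) {
        if (target == null || propertyName.isEmpty()) {
            return null;
        }
        String getter = "get"
                + Character.toUpperCase(propertyName.charAt(0))
                + propertyName.substring(1);
        try {
            Method method = target.getClass().getMethod(getter);
            return method.invoke(target);
        } catch (ReflectiveOperationException ex) {
            return null;   // no such property on this object: mirrors the null result
        }
    }
}

// Sample classes: both have "price", only Service has "serviceName".
class Service {
    public double getPrice() { return 10.0; }
    public String getServiceName() { return "support"; }
}

class Product {
    public double getPrice() { return 25.0; }
}
```

With these classes, getDynamic(item, "price") yields a value for both Service and Product, while getDynamic(item, "serviceName") yields null for a Product. Because the result is typed as Object, a caller usually needs a cast, which is the role the cast function plays in EPL.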

When dynamic properties are nested, then all properties under the dynamic property are also considered dynamic properties. In the below example the query asks for the "direction" property of the object returned by the "detail" dynamic property:

select detail?.direction from OrderEvent

The above is equivalent to:

select detail?.direction? from OrderEvent

The functions that are often useful in conjunction with dynamic properties are:

  • The cast function casts the value of a dynamic property (or the value of an expression) to a given type.

  • The exists function checks whether a dynamic property exists. It returns true if the event has a property of that name, or false if the property does not exist on that event.

  • The instanceof function checks whether the value of a dynamic property (or the value of an expression) is of any of the given types.

  • The typeof function returns the string type name of a dynamic property.

Dynamic event properties work with all event representations outlined next: Java objects, Map-based, Object-array-based and XML DOM-based events.

Plain-old Java object events are object instances that expose event properties through JavaBeans-style getter methods. Event classes or interfaces do not have to be fully compliant with the JavaBean specification; however, for the Esper engine to obtain event properties, the required JavaBean getter methods must be present, or an accessor style and accessor methods may be defined via configuration.

Esper supports JavaBeans-style event classes that extend a superclass or implement one or more interfaces. Also, Esper event pattern and EPL statements can refer to Java interface classes and abstract classes.

Classes that represent events should be made immutable. As events are recordings of a state change or action that occurred in the past, the relevant event properties should not be changeable. However this is not a hard requirement and the Esper engine accepts events that are mutable as well.

The hashCode and equals methods do not need to be implemented. The implementation of these methods by a Java event class does not affect the behavior of the engine in any way.

Please see Chapter 16, Configuration on options for naming event types represented by Java object event classes. Java classes that do not follow JavaBean conventions, such as legacy Java classes that expose public fields, or methods not following naming conventions, require additional configuration. Via configuration it is also possible to control case sensitivity in property name resolution. The relevant section in the chapter on configuration is Section 16.4.1.3, “Non-JavaBean and Legacy Java Event Classes”.

As outlined earlier, some of the different property types are supported by the standard JavaBeans specification, and some are uniquely supported by Esper:

Assume there is a NewEmployeeEvent event class as shown below. The mapped and indexed properties in this example return Java objects but could also return Java language primitive types or strings (such as int or String). The Address object and Employee can themselves have properties that are nested within them, such as a street name in the Address object or a name of the employee in the Employee object.

public class NewEmployeeEvent {
	public String getFirstName();
	public Address getAddress(String type);
	public Employee getSubordinate(int index);
	public Employee[] getAllSubordinates();
}

Simple event properties require a getter-method that returns the property value. In this example, the getFirstName getter method returns the firstName event property of type String.

Indexed event properties require one of the following getter methods: a method that takes an integer-typed index value and returns the property value, such as the getSubordinate method, or a method that returns an array type or a class that implements Iterable. An example of the latter is the getAllSubordinates getter method, which returns an array of Employee but could also return an Iterable. In an EPL or event pattern statement, indexed properties are accessed via the property[index] syntax.

Mapped event properties require a getter-method that takes a String-typed key value and returns the property value, such as the getAddress method. In an EPL or event pattern statement, mapped properties are accessed via the property('key') syntax.

Nested event properties require a getter-method that returns the nesting object. The getAddress and getSubordinate methods are mapped and indexed properties that return a nesting object. In an EPL or event pattern statement, nested properties are accessed via the property.nestedProperty syntax.

All event pattern and EPL statements allow the use of indexed, mapped and nested properties (or a combination of these) anywhere where one or more event property names are expected. The below example shows different combinations of indexed, mapped and nested properties in filters of event pattern expressions (each line is a separate EPL statement):

every NewEmployeeEvent(firstName='myName')
every NewEmployeeEvent(address('home').streetName='Park Avenue')
every NewEmployeeEvent(subordinate[0].name='anotherName')
every NewEmployeeEvent(allSubordinates[1].name='thatName')
every NewEmployeeEvent(subordinate[0].address('home').streetName='Water Street')

Similarly, the syntax can be used in EPL statements in all places where an event property name is expected, such as in select lists, where-clauses or join criteria.

select firstName, address('work'), subordinate[0].name, subordinate[1].name
from NewEmployeeEvent
where address('work').streetName = 'Park Ave'
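The NewEmployeeEvent declaration earlier in this section omits method bodies. The following is one possible concrete, immutable implementation that provides the getter shapes described above. The Address and Employee classes, their streetName and name properties, and the constructors are assumptions for illustration. The classes are package-private only so that the sketch fits in one block; in an application they would normally be public classes in their own files.

```java
import java.util.HashMap;
import java.util.Map;

class Address {
    private final String streetName;
    Address(String streetName) { this.streetName = streetName; }
    public String getStreetName() { return streetName; }      // nested: address('home').streetName
}

class Employee {
    private final String name;
    Employee(String name) { this.name = name; }
    public String getName() { return name; }                  // nested: subordinate[0].name
}

class NewEmployeeEvent {
    private final String firstName;
    private final Map<String, Address> addresses;
    private final Employee[] subordinates;

    NewEmployeeEvent(String firstName, Map<String, Address> addresses, Employee[] subordinates) {
        this.firstName = firstName;
        this.addresses = new HashMap<>(addresses);   // defensive copies keep the event immutable
        this.subordinates = subordinates.clone();
    }

    // simple property: firstName
    public String getFirstName() { return firstName; }

    // mapped property: address('home')
    public Address getAddress(String type) { return addresses.get(type); }

    // indexed property via index parameter: subordinate[0]
    public Employee getSubordinate(int index) { return subordinates[index]; }

    // indexed property via array return type: allSubordinates[1]
    public Employee[] getAllSubordinates() { return subordinates.clone(); }
}
```

Copying the map and array in the constructor and in getAllSubordinates follows the recommendation that event classes be immutable. It is a design choice of this sketch, not a requirement of the engine.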

Events can also be represented by objects that implement the java.util.Map interface. Event properties of Map events are the values in the map accessible through the get method exposed by the java.util.Map interface.

Similar to the Object-array event type, the Map event type takes part in the comprehensive type system that can eliminate the need to use Java classes as event types, thereby making it easier to change types at runtime or generate type information from another source.

A given Map event type can have one or more supertypes that must also be Map event types. All properties available on any of the Map supertypes are available on the type itself. In addition, anywhere within EPL that an event type name of a Map supertype is used, any of its Map subtypes and their subtypes match that expression.

Your application can add properties to an existing Map event type during runtime using the configuration operation updateMapEventType. Properties may not be updated or deleted - properties can only be added, and nested properties can be added as well. The runtime configuration also allows removing Map event types and adding them back with new type information.

After your application configures a Map event type by providing a type name, the type name can be used when defining further Map or Object-array event types by specifying the type name as a property type or an array property type.

One-to-Many relationships in Map event types are represented via arrays. A property in a Map event type may be an array of primitive, an array of Java object, an array of Map or an array of Object-array.

The engine can process java.util.Map events via the sendEvent(Map map, String eventTypeName) method on the EPRuntime interface. Entries in the Map represent event properties. Keys must be of type java.lang.String for the engine to be able to look up event property names specified by pattern or EPL statements.

The engine does not validate Map event property names or values. Your application should ensure that objects passed in as event properties match the create schema property names and types, or the configured event type information when using runtime or static configuration.
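Because the engine does not perform this validation, an application may want to check events itself before sending them. The following is a minimal sketch of such a check, assuming the type definition is held as a Map of property name to Class, the shape used by the type-definition examples in this section. The class MapEventCheck and the method problems are hypothetical helpers and not part of the Esper API. Nested type names and nested Map definitions are deliberately skipped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapEventCheck {
    // Primitive types as declared (for example int.class) arrive boxed inside a Map.
    private static final Map<Class<?>, Class<?>> BOXED = new HashMap<>();
    static {
        BOXED.put(int.class, Integer.class);
        BOXED.put(long.class, Long.class);
        BOXED.put(double.class, Double.class);
        BOXED.put(boolean.class, Boolean.class);
        // remaining primitives omitted for brevity
    }

    // Returns human-readable problems; the list is empty when the event matches the definition.
    static List<String> problems(Map<String, Object> definition, Map<String, Object> event) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Object> entry : event.entrySet()) {
            Object declared = definition.get(entry.getKey());
            if (declared == null) {
                result.add("undeclared property: " + entry.getKey());
                continue;
            }
            Object value = entry.getValue();
            if (value == null || !(declared instanceof Class)) {
                continue;   // nulls and nested type names or maps are not checked in this sketch
            }
            Class<?> type = (Class<?>) declared;
            Class<?> expected = BOXED.getOrDefault(type, type);
            if (!expected.isInstance(value)) {
                result.add(entry.getKey() + ": expected " + expected.getSimpleName()
                        + " but got " + value.getClass().getSimpleName());
            }
        }
        return result;
    }
}
```

A caller could log or reject the event when the returned list is not empty, and send it otherwise.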

Map event properties can be of any type. Map event properties that are Java application objects or that are of type java.util.Map (or arrays thereof) or that are of type Object[] (object-array) (or arrays thereof) offer additional power:

In order to use Map events, the event type name and property names and types must be made known to the engine via Configuration or create schema EPL syntax. Please see examples in Section 5.15, “Declaring an Event Type: Create Schema” and Section 16.4.2, “Events represented by java.util.Map”.

The code snippet below defines a Map event type, creates a Map event and sends the event into the engine. The sample defines the CarLocUpdateEvent event type via runtime configuration interface (create schema or static configuration could have been used instead).

// Define CarLocUpdateEvent event type (example for runtime-configuration interface)
Map<String, Object> def = new HashMap<String, Object>();
def.put("carId", String.class);
def.put("direction", int.class);

epService.getEPAdministrator().getConfiguration().
  addEventType("CarLocUpdateEvent", def);

The CarLocUpdateEvent can now be used in a statement:

select carId from CarLocUpdateEvent.win:time(1 min) where direction = 1

// Create a CarLocUpdateEvent event and send it into the engine for processing
Map<String, Object> event = new HashMap<String, Object>();
event.put("carId", carId);
event.put("direction", direction);

epRuntime.sendEvent(event, "CarLocUpdateEvent");

The engine can also query Java objects as values in a Map event via the nested property syntax. Thus Map events can be used to aggregate multiple data structures into a single event and query the composite information in a convenient way. The example below demonstrates a Map event with a transaction and an account object.

Map event = new HashMap();
event.put("txn", txn);
event.put("account", account);
epRuntime.sendEvent(event, "TxnEvent");

An example statement could look as follows.

select account.id, account.rate * txn.amount 
from TxnEvent.win:time(60 sec) 
group by account.id

Your Map event type may declare one or more supertypes when configuring the type at engine initialization time or at runtime through the administrative interface.

Supertypes of a Map event type must also be Map event types. All property names and types of a supertype are also available on a subtype and override such same-name properties of the subtype. In addition, anywhere within EPL that an event type name of a Map supertype is used, any of its Map subtypes also matches that expression (similar to the concept of interface in Java).

This example assumes that the BaseUpdate event type has been declared and acts as a supertype to the AccountUpdate event type (both Map event types):

epService.getEPAdministrator().getConfiguration().
    addEventType("AccountUpdate", accountUpdateDef, 
    new String[] {"BaseUpdate"});

Your application EPL statements may select BaseUpdate events and receive both BaseUpdate and AccountUpdate events, as well as any other subtypes of BaseUpdate and their subtypes.

// Receive BaseUpdate and any subtypes including subtypes of subtypes
select * from BaseUpdate

Your application Map event type may have multiple supertypes. The multiple inheritance hierarchy between Maps can be arbitrarily deep, however cyclic dependencies are not allowed. If using runtime configuration, supertypes must exist before a subtype to a supertype can be added.
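The matching rule for supertypes can be sketched as a walk over a type graph. The code below is a conceptual model only and says nothing about how the engine stores type information. The class MapTypeHierarchySketch and its methods are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MapTypeHierarchySketch {
    // type name -> names of its direct supertypes
    private final Map<String, List<String>> supertypes = new HashMap<>();

    // Supertypes must already exist and a name can be added only once,
    // so a cyclic dependency cannot be constructed.
    void addType(String name, String... directSupertypes) {
        if (supertypes.containsKey(name)) {
            throw new IllegalArgumentException("type already defined: " + name);
        }
        for (String supertype : directSupertypes) {
            if (!supertypes.containsKey(supertype)) {
                throw new IllegalArgumentException("unknown supertype: " + supertype);
            }
        }
        supertypes.put(name, Arrays.asList(directSupertypes));
    }

    // True when an event of eventType matches a statement that selects from selectedType,
    // that is, when selectedType is the type itself or any direct or indirect supertype.
    boolean matches(String eventType, String selectedType) {
        Deque<String> pending = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        pending.push(eventType);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (current.equals(selectedType)) {
                return true;
            }
            if (seen.add(current)) {
                for (String s : supertypes.getOrDefault(current, Collections.<String>emptyList())) {
                    pending.push(s);
                }
            }
        }
        return false;
    }
}
```

In this model a statement selecting from BaseUpdate matches AccountUpdate events and events of any further subtype, while a statement selecting from AccountUpdate does not match plain BaseUpdate events.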

See Section 16.4.2, “Events represented by java.util.Map” for more information on configuring Map event types.

Strongly-typed nested Map-within-Map events can be used to build rich, type-safe event types on the fly. Use the addEventType method on Configuration or ConfigurationOperations for initialization-time and runtime type definition, or the create schema EPL syntax.

Noteworthy points are:

For demonstration, in this example our top-level event type is an AccountUpdate event, which has an UpdatedFieldType structure as a property. Inside the UpdatedFieldType structure the example defines various fields, as well as a property by name 'history' that holds a JavaBean class UpdateHistory to represent the update history for the account. The code snippet to define the event type is thus:

Map<String, Object> updatedFieldDef = new HashMap<String, Object>();
updatedFieldDef.put("name", String.class);
updatedFieldDef.put("addressLine1", String.class);
updatedFieldDef.put("history", UpdateHistory.class);
epService.getEPAdministrator().getConfiguration().
    addEventType("UpdatedFieldType", updatedFieldDef);

Map<String, Object> accountUpdateDef = new HashMap<String, Object>();
accountUpdateDef.put("accountId", long.class);
accountUpdateDef.put("fields", "UpdatedFieldType");	
// the latter can also be:  accountUpdateDef.put("fields", updatedFieldDef);

epService.getEPAdministrator().getConfiguration().
    addEventType("AccountUpdate", accountUpdateDef);

The next code snippet populates a sample event and sends the event into the engine:

Map<String, Object> updatedField = new HashMap<String, Object>();
updatedField.put("name", "Joe Doe");
updatedField.put("addressLine1", "40 Popular Street");
updatedField.put("history", new UpdateHistory());

Map<String, Object> accountUpdate = new HashMap<String, Object>();
accountUpdate.put("accountId", 10009901L);	// long literal, matching the declared long.class type
accountUpdate.put("fields", updatedField);

epService.getEPRuntime().sendEvent(accountUpdate, "AccountUpdate");

Last, a sample query to interrogate AccountUpdate events is as follows:

select accountId, fields.name, fields.addressLine1, fields.history.lastUpdate
from AccountUpdate

An event can also be represented by an array of objects. Event properties of Object[] events are the array element values.

Similar to the Map event type, the object-array event type takes part in the comprehensive type system that can eliminate the need to use Java classes as event types, thereby making it easier to change types at runtime or generate type information from another source.

A given Object-array event type can have only a single supertype that must also be an Object-array event type. All properties available on the Object-array supertype are also available on the type itself. In addition, anywhere within EPL that an event type name of an Object-array supertype is used, any of its Object-array subtypes and their subtypes match that expression.

Your application can add properties to an existing Object-array event type during runtime using the configuration operation updateObjectArrayEventType. Properties may not be updated or deleted - properties can only be added, and nested properties can be added as well. The runtime configuration also allows removing Object-array event types and adding them back with new type information.

After your application configures an Object-array event type by providing a type name, the type name can be used when defining further Object-array or Map event types by specifying the type name as a property type or an array property type.

One-to-Many relationships in Object-array event types are represented via arrays. A property in an Object-array event type may be an array of primitive, an array of Java object, an array of Map or an array of Object-array.

The engine can process Object[] events via the sendEvent(Object[] array, String eventTypeName) method on the EPRuntime interface. Entries in the Object array represent event properties.

The engine does not validate Object array length or value types. Your application must ensure that Object array values match the declaration of the event type: The type and position of property values must match property names and types in the same exact order and object array length must match the number of properties declared via create schema or the static or runtime configuration.
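Since the engine does not check length or value types, an application-side check along the following lines may be useful. This is a minimal sketch that assumes the property names and types are held in parallel arrays. As a simplification, the types are given as Class objects only, so properties declared by event type name are not covered. ObjectArrayEventCheck and firstProblem are hypothetical names and not part of the Esper API.

```java
class ObjectArrayEventCheck {
    // Returns null when the row matches the declaration,
    // otherwise a description of the first mismatch found.
    static String firstProblem(String[] names, Class<?>[] types, Object[] row) {
        if (row.length != names.length) {
            return "expected " + names.length + " values but got " + row.length;
        }
        for (int i = 0; i < row.length; i++) {
            if (row[i] != null && !box(types[i]).isInstance(row[i])) {
                return names[i] + " (position " + i + "): expected "
                        + types[i].getSimpleName() + " but got "
                        + row[i].getClass().getSimpleName();
            }
        }
        return null;
    }

    // Values of primitive-typed properties are boxed when stored in an Object[].
    private static Class<?> box(Class<?> type) {
        if (type == int.class) return Integer.class;
        if (type == long.class) return Long.class;
        if (type == double.class) return Double.class;
        if (type == boolean.class) return Boolean.class;
        return type;   // remaining primitives omitted for brevity
    }
}
```

The position-based check reflects the point made above: unlike Map events, nothing in an object array names its values, so a swapped order can only be detected through length and type mismatches.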

Object-array event properties can be of any type. Object-array event properties that are Java application objects or that are of type java.util.Map (or arrays thereof) or that are of type Object-array (or arrays thereof) offer additional power:

In order to use Object[] (object-array) events, the event type name and property names and types, in a well-defined order that must match object-array event properties, must be made known to the engine via configuration or create schema EPL syntax. Please see examples in Section 5.15, “Declaring an Event Type: Create Schema” and Section 16.4.3, “Events represented by Object[] (Object-array)”.

The code snippet below defines an Object-array event type, creates an Object-array event and sends the event into the engine. The sample defines the CarLocUpdateEvent event type via the runtime configuration interface (create schema or static configuration could have been used instead).

// Define CarLocUpdateEvent event type (example for runtime-configuration interface)
String[] propertyNames = {"carId", "direction"};   // order is important
Object[] propertyTypes = {String.class, int.class};  // type order matches name order

epService.getEPAdministrator().getConfiguration().
  addEventType("CarLocUpdateEvent", propertyNames, propertyTypes);

The CarLocUpdateEvent can now be used in a statement:

select carId from CarLocUpdateEvent.win:time(1 min) where direction = 1

// Send an event
Object[] event = {carId, direction};
epRuntime.sendEvent(event, "CarLocUpdateEvent");

The engine can also query Java objects as values in an Object[] event via the nested property syntax. Thus Object[] events can be used to aggregate multiple data structures into a single event and query the composite information in a convenient way. The example below demonstrates an Object[] event with a transaction and an account object.

epRuntime.sendEvent(new Object[] {txn, account}, "TxnEvent");

An example statement could look as follows:

select account.id, account.rate * txn.amount 
from TxnEvent.win:time(60 sec) 
group by account.id

Strongly-typed nested Object[]-within-Object[] events can be used to build rich, type-safe event types on the fly. Use the addEventType method on Configuration or ConfigurationOperations for initialization-time and runtime type definition, or the create schema EPL syntax.

Noteworthy points are:

For demonstration, in this example our top-level event type is an AccountUpdate event, which has an UpdatedFieldType structure as a property. Inside the UpdatedFieldType structure the example defines various fields, as well as a property by name 'history' that holds a JavaBean class UpdateHistory to represent the update history for the account. The code snippet to define the event type is thus:

String[] propertyNamesUpdField = {"name", "addressLine1", "history"};
Object[] propertyTypesUpdField = {String.class, String.class, UpdateHistory.class};
epService.getEPAdministrator().getConfiguration().
    addEventType("UpdatedFieldType", propertyNamesUpdField, propertyTypesUpdField);

String[] propertyNamesAccountUpdate = {"accountId", "fields"};
Object[] propertyTypesAccountUpdate = {long.class, "UpdatedFieldType"};
epService.getEPAdministrator().getConfiguration().
    addEventType("AccountUpdate", propertyNamesAccountUpdate, propertyTypesAccountUpdate);

The next code snippet populates a sample event and sends the event into the engine:

Object[] updatedField = {"Joe Doe", "40 Popular Street", new UpdateHistory()};
Object[] accountUpdate = {10009901L, updatedField};  // long literal, matching the declared long.class type

epService.getEPRuntime().sendEvent(accountUpdate, "AccountUpdate");

Last, a sample query to interrogate AccountUpdate events is as follows:

select accountId, fields.name, fields.addressLine1, fields.history.lastUpdate
from AccountUpdate

Events can be represented as org.w3c.dom.Node instances and sent into the engine via the sendEvent method on EPRuntime or via EventSender. Please note that configuration is required so that the event type name and root element name are known. See Chapter 16, Configuration.

If an XML schema document (XSD file) can be made available as part of the configuration, then Esper can read the schema and appropriately present event type metadata and validate statements that use the event type and its properties. See Section 2.8.1, “Schema-Provided XML Events”.

When no XML schema document is provided, XML events can still be queried, however the return type and return values of property expressions are string-only and no event type metadata is available other than for explicitly configured properties. See Section 2.8.2, “No-Schema-Provided XML Events”.

In all cases Esper allows you to configure explicit XPath expressions as event properties. You can specify arbitrary XPath functions or expressions and provide a property name and type by which result values will be available for use in EPL statements. See Section 2.8.3, “Explicitly-Configured Properties”.

Nested, mapped and indexed event properties are also supported in expressions against org.w3c.dom.Node events. Thus XML trees can conveniently be interrogated via the property expression syntax.

Only one event type per root element name may be configured. The engine recognizes each event by its root element name or you may use EventSender to send events.

This section uses the following XML document as an example:

<?xml version="1.0" encoding="UTF-8"?>
<Sensor xmlns="SensorSchema">
  <ID>urn:epc:1:4.16.36</ID>
  <Observation Command="READ_PALLET_TAGS_ONLY">
    <ID>00000001</ID>
    <Tag>
      <ID>urn:epc:1:2.24.400</ID>
    </Tag>
    <Tag>
      <ID>urn:epc:1:2.24.401</ID>
    </Tag>
  </Observation>
</Sensor>

The schema for the example is:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="Sensor">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ID" type="xs:string"/>
        <xs:element ref="Observation" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="Observation">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ID" type="xs:string"/>
        <xs:element ref="Tag" maxOccurs="unbounded" />
      </xs:sequence>
      <xs:attribute name="Command" type="xs:string" use="required" />
    </xs:complexType>
  </xs:element>

  <xs:element name="Tag">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ID" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

If you have an XSD schema document available for your XML events, Esper can interrogate the schema. The benefits are:

The engine reads an XSD schema file from a URL you provide. Make sure files imported by the XSD schema file can also be resolved.

The configuration accepts a schema URL. This is a sample code snippet to determine a schema URL from a file in classpath:

URL schemaURL = this.getClass().getClassLoader().getResource("sensor.xsd");

Here is a sample use of the runtime configuration API. Please see Chapter 16, Configuration for further examples.

epService = EPServiceProviderManager.getDefaultProvider();
ConfigurationEventTypeXMLDOM sensorcfg = new ConfigurationEventTypeXMLDOM();
sensorcfg.setRootElementName("Sensor");
sensorcfg.setSchemaResource(schemaURL.toString());
epService.getEPAdministrator().getConfiguration()
    .addEventType("SensorEvent", sensorcfg);

You must provide a root element name. This name is used to look up the event type for the sendEvent(org.w3c.dom.Node node) method. An EventSender is a useful alternative method for sending events if the type lookup based on the root or document element name is not desired.

After adding the event type, you may create statements and send events. Next is a sample statement:

select ID, Observation.Command, Observation.ID, 
  Observation.Tag[0].ID, Observation.Tag[1].ID
from SensorEvent

As you can see from the example above, property expressions can query property values held in the XML document's elements and attributes.

There are multiple ways to obtain an XML DOM document instance from an XML string. The next code snippet shows how to obtain an XML DOM org.w3c.dom.Document instance:

InputSource source = new InputSource(new StringReader(xml));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true);
Document doc = builderFactory.newDocumentBuilder().parse(source);

Send the org.w3c.dom.Node or Document object into the engine for processing:

epService.getEPRuntime().sendEvent(doc);

By setting the xpath-property-expr option the engine rewrites each property expression as an XPath expression, effectively handing the evaluation over to the underlying XPath implementation available from classpath. Most JVMs have a built-in XPath implementation, and there are also optimized, fast implementations such as Jaxen that can be used as well.

Set the xpath-property-expr option if you need namespace-aware document traversal, such as when your schema mixes several namespaces and element names are overlapping.

The below table samples several property expressions and the XPath expression generated for each, without namespace prefixes to keep the example simple:


For mapped properties that are specified via the syntax name('key'), the algorithm looks for an attribute by name id and generates an XPath expression as mapped[@id='key'].

Finally, here is an example that includes all different types of properties and their XPath expression equivalent in one property expression:

select nested.mapped('key').indexed[1].attribute from MyEvent

The equivalent XPath expression follows, this time including n0 as a sample namespace prefix:

/n0:rootelement/n0:nested/n0:mapped[@id='key']/n0:indexed[position() = 2]/@attribute
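To see how such an expression behaves, it can be evaluated directly with the XPath implementation built into the JDK. The sketch below follows the pattern shown above, where the zero-based property index [1] becomes the one-based XPath predicate [position() = 2]. It uses a cut-down, namespace-free variant of the Sensor document. The class XPathPropertySketch is made up for this example, and the XPath strings are hand-written in that pattern rather than taken from engine output.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathPropertySketch {
    // Namespace-free sample, loosely based on the Sensor document in this section.
    static final String SAMPLE_XML =
            "<Sensor>"
          + "<Observation Command=\"READ_PALLET_TAGS_ONLY\">"
          + "<Tag><ID>urn:epc:1:2.24.400</ID></Tag>"
          + "<Tag><ID>urn:epc:1:2.24.401</ID></Tag>"
          + "</Observation>"
          + "</Sensor>";

    // Parses the XML string and returns the string value of the XPath expression.
    static String evaluate(String xml, String xpath) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception ex) {
            throw new IllegalArgumentException("cannot evaluate " + xpath, ex);
        }
    }

    public static void main(String[] args) {
        // Property expression Observation.Tag[1].ID, written as XPath by hand
        System.out.println(evaluate(SAMPLE_XML, "/Sensor/Observation/Tag[position() = 2]/ID"));
        // Property expression Observation.Command, an attribute
        System.out.println(evaluate(SAMPLE_XML, "/Sensor/Observation/@Command"));
    }
}
```

The first expression selects the ID of the second Tag element, and the second selects the Command attribute. A document that declares namespaces would additionally need prefixes in the expression and a NamespaceContext on the XPath object, which this sketch leaves out.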

When providing an XSD document, the default configuration allows property values that are themselves complex elements, as defined in the XSD schema, to be transposed into a new stream. This behavior can be controlled via the flag auto-fragment.

For example, consider the next query:

insert into ObservationStream
select ID, Observation from SensorEvent

The Observation as a property of the SensorEvent gets itself inserted into a new stream by name ObservationStream. The ObservationStream thus consists of a string-typed ID property and a complex-typed property named Observation, as described in the schema.

A further statement can use this stream to query:

select Observation.Command, Observation.Tag[0].ID from ObservationStream

Before continuing the discussion, here is an alternative syntax using the wildcard-select, that is also useful:

insert into TagListStream
select ID as sensorId, Observation.* from SensorEvent

The new TagListStream has a string-typed ID and Command property as well as an array of Tag properties that are complex types themselves as defined in the schema.

Next is a sample statement to query the new stream:

select sensorId, Command, Tag[0].ID from TagListStream

Please note the following limitations:

Esper automatically registers a new event type for transposed properties. It generates the type name of the new XML event type from the XML event type name and the property names used in the expression. The synopsis is type_name.property_name[.property_name...]. The type name can be looked up, for example for use with EventSender, or can be created in advance.

Regardless of whether or not you provide an XSD schema for the XML event type, you can always fall back to configuring explicit properties that are backed by XPath expressions.

For further documentation on XPath, please consult the XPath standard or other online material. Consider using Jaxen or Apache Axiom, for example, to provide faster XPath evaluation than your Java VM built-in XPath provider may offer.

Part of the extension and plug-in features of Esper is an event representation API. This set of classes allows an application to create new event types and event instances based on information available elsewhere, statically or dynamically at runtime when EPL statements are created. Please see Section 18.8, “Event Type And Event Object” for details.

Creating a plug-in event representation can be useful when your application has existing Java classes that carry event metadata and event property values and your application does not want to (or cannot) extract or transform such event metadata and event data into one of the built-in event representations (POJO Java objects, Map, Object-array or XML DOM).

A further use of a plug-in event representation is to provide a faster or short-cut access path to event data. For example, access to event data stored in an XML format through the Streaming API for XML (StAX) is known to be very efficient. A plug-in event representation can also provide network lookup and dynamic resolution of event type and dynamic sourcing of event instances.

Currently, EsperIO provides the following additional event representations:

  • Apache Axiom: Streaming API for XML (StAX) implementation

Please see the EsperIO documentation for details on the above.

Section 18.8, “Event Type And Event Object” explains how to create your own custom event representation.

To summarize, an event is an immutable record of a past occurrence of an action or state change, and event properties contain useful information about an event.

The length of time an event is of interest to the event processing engine (retention time) depends on your EPL statements, and especially the data window, pattern and output rate limiting clauses of your statements.

During the retention time of an event more information about the event may become available, such as additional properties or changes to existing properties. Esper provides three concepts for handling updates to events.

The first means to handle updating events is the update istream clause as further described in Section 5.20, “Updating an Insert Stream: the Update IStream Clause”. It is useful when you need to update events as they enter a stream, before events are evaluated by any particular consuming statement to that stream.

The second means to update events is the on-merge and on-update clauses, for use with tables and named windows only, as further described in Section 6.8, “Triggered Upsert using the On-Merge Clause” and Section 6.6, “Updating Data: the On Update clause”. On-merge is similar to the SQL merge clause and provides what is known as an "Upsert" operation: Update existing events or if no existing event(s) are found then insert a new event, all in one atomic operation provided by a single EPL statement. On-update can be used to update individual properties of rows held in a table or named window.
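The upsert semantics described for on-merge can be pictured in plain Java, outside of Esper, as a keyed map in which an arriving value either updates the existing row or inserts a new one. The following is a minimal sketch under that assumption. The class UpsertSketch, its method names and the sample keys are hypothetical and are not part of the Esper API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical plain-Java model of "update if present, else insert".
// This is not Esper code; it only mirrors the upsert idea behind on-merge.
class UpsertSketch {
    private final ConcurrentMap<String, Double> rows = new ConcurrentHashMap<>();

    // Adds delta to the row stored under key, or inserts a new row holding delta.
    // ConcurrentMap.merge performs the lookup and the write as one atomic step.
    double upsert(String key, double delta) {
        return rows.merge(key, delta, Double::sum);
    }

    int size() {
        return rows.size();
    }

    public static void main(String[] args) {
        UpsertSketch sketch = new UpsertSketch();
        System.out.println(sketch.upsert("acct-1", 100.0)); // first call inserts
        System.out.println(sketch.upsert("acct-1", 50.0));  // second call updates
    }
}
```

In EPL the same decision is expressed declaratively by a single on-merge statement, so no application code of this kind is needed.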

The third means to handle updating events is the revision event types, for use with named windows only, as further described in Section 6.11, “Versioning and Revision Event Type Use with Named Windows”. With revision event types one can declare, via configuration only, multiple different event types and then have the engine present a merged event type that contains a superset of properties of all merged types, and have the engine merge events as they arrive without additional EPL statements.

Note that patterns do not reflect changes to past events. Due to the temporal nature of patterns, changes to events that were observed in the past are not reflected in the current pattern state.

The insert into clause can instantiate new instances of Java object events, java.util.Map events and Object[] (object array) events directly from the results of select clause expressions and populate such instances. Simply use the event type name as the stream name in the insert into clause as described in Section 5.10, “Merging Streams and Continuous Insertion: the Insert Into Clause”.

If instead you have an existing instance of a Java object returned by an expression, such as a single-row function or static method invocation for example, you can transpose that expression result object to a stream. This is described further in Section 5.10.7, “Transposing an Expression Result” and Section 10.4, “Select-Clause transpose Function”.

The column names specified in the select and insert into clause must match available writable properties in the event object to be populated (the target event type). The expression result types of any expressions in the select clause must also be compatible with the property types of the target event type.

If populating a POJO-based event type and the class provides a matching constructor, the expression result types of expressions in the select clause must be compatible with the constructor parameters in the order listed by the constructor. The insert into clause column names are not relevant in this case.

Consider the following example statement:

insert into com.mycompany.NewEmployeeEvent 
select fname as firstName, lname as lastName from HRSystemEvent

The above example specifies the fully-qualified class name of NewEmployeeEvent. The engine instantiates NewEmployeeEvent for each result row and populates the firstName and lastName properties of each instance from the result of select clause expressions. The HRSystemEvent in the example is assumed to have lname and fname properties, and NewEmployeeEvent is assumed to provide either setter-methods and a default constructor, or a matching constructor.

Note how the example uses the as-keyword to assign column names that match the property names of the NewEmployeeEvent target event. If the property names of the source and target events are the same, the as-keyword is not required.

The next example is an alternate form and specifies property names within the insert into clause instead. The example also assumes that NewEmployeeEvent has been defined or imported via configuration since it does not specify the event class package name:

insert into NewEmployeeEvent(firstName, lastName) 
select fname, lname from HRSystemEvent

Finally, this example populates HRSystemEvent events. For each HRSystemEvent whose type property has the value 'NEW', the statement populates a new event object with the type value 'HIRED', copying the fname and lname property values to the new event object:

insert into HRSystemEvent 
select fname, lname, 'HIRED' as type from HRSystemEvent(type='NEW')

The matching of the select or insert into-clause column names to the target event type's property names is case-sensitive. A statement may populate only a subset of the available columns in the target event type. Wildcard (*) is also allowed and copies all fields of the events or multiple events in a join.

For Java object events, your event class must provide setter-methods according to JavaBean conventions or, alternatively, a matching constructor. If the event class provides setter methods the class should also provide a default constructor taking no parameters. If the event class provides a matching constructor there is no need for setter-methods. If your event class does not have a default constructor and setter methods, or a matching constructor, your application may configure a factory method via ConfigurationEventTypeLegacy. If your event class does not have a default constructor and there is no factory method provided, the engine, when running on the Oracle JVM, uses sun.reflect.ReflectionFactory to instantiate the class. Note that in this case member variables are not initialized to their assigned defaults.
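As an illustration of the two population styles, the sketch below shows one possible shape for the NewEmployeeEvent class used in the earlier examples. The class body is an assumption, since the original does not show it. The class is declared package-private only to keep the sketch in one file; an event class registered with the engine would normally be public.

```java
// Hypothetical target event class for the insert-into examples.
// It supports both styles: default constructor plus setters, or a matching constructor.
class NewEmployeeEvent {
    private String firstName;
    private String lastName;

    // Default constructor, used together with the JavaBean setter methods.
    public NewEmployeeEvent() {
    }

    // Matching constructor: parameter order follows the select-clause order
    // (first name, then last name).
    public NewEmployeeEvent(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    public static void main(String[] args) {
        NewEmployeeEvent viaSetters = new NewEmployeeEvent();
        viaSetters.setFirstName("Ann");
        viaSetters.setLastName("Lee");
        NewEmployeeEvent viaConstructor = new NewEmployeeEvent("Ann", "Lee");
        System.out.println(viaSetters.getFirstName() + " " + viaConstructor.getLastName());
    }
}
```

Providing only one of the two styles is sufficient according to the rules above; the sketch includes both for comparison.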

The engine follows Java standards in terms of widening, performing widening automatically in cases where widening type conversion is allowed without loss of precision, for both boxed and primitive types and including BigInteger and BigDecimal.
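For readers less familiar with the Java rules referred to here, the following sketch lists a few widening conversions that the Java language itself performs without loss of precision. It only demonstrates the language rules; the exact set of conversions the engine applies is not enumerated in this section, and the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical demonstration of lossless widening in plain Java.
class WideningSketch {
    // int to long: implicit widening primitive conversion.
    static long toLong(int value) {
        return value;
    }

    // float to double: implicit widening primitive conversion.
    static double toDouble(float value) {
        return value;
    }

    // Boxed Integer to long: unboxing followed by widening.
    static long fromBoxed(Integer value) {
        return value;
    }

    // long to BigInteger and int to BigDecimal need an explicit factory call in Java.
    static BigInteger toBigInteger(long value) {
        return BigInteger.valueOf(value);
    }

    static BigDecimal toBigDecimal(int value) {
        return BigDecimal.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(toLong(Integer.MAX_VALUE));
        System.out.println(toBigDecimal(42));
    }
}
```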

When inserting array-typed properties into a Java, Map-type or Object-array underlying event, the event definition should declare the target property as an array.

Please note the following limitations:

  • Event types that utilize XML org.w3c.dom.Node underlying event objects cannot be target of an insert into clause.

The Esper processing model is continuous: Update listeners and/or subscribers to statements receive updated data as soon as the engine processes events for that statement, according to the statement's choice of event streams, views, filters and output rates.

As outlined in Chapter 15, API Reference the interface for listeners is com.espertech.esper.client.UpdateListener. Implementations must provide a single update method, update(EventBean[] newEvents, EventBean[] oldEvents), that the engine invokes when results become available.

A second, strongly-typed and native, highly-performant method of result delivery is provided: A subscriber object is a direct binding of query results to a Java object. The object, a POJO, receives statement results via method invocation. The subscriber class need not implement an interface or extend a superclass. Please see Section 15.3.3, “Setting a Subscriber Object”.

The engine provides statement results to update listeners by placing results in com.espertech.esper.client.EventBean instances. A typical listener implementation queries the EventBean instances via getter methods to obtain the statement-generated results.

The get method on the EventBean interface can be used to retrieve result columns by name. The property name supplied to the get method can also be used to query nested, indexed or array properties of object graphs as discussed in more detail in Chapter 2, Event Representations and Section 15.6, “Event and Event Type”.

The getUnderlying method on the EventBean interface allows update listeners to obtain the underlying event object. For wildcard selects, the underlying event is the event object that was sent into the engine via the sendEvent method. For joins and select clauses with expressions, the underlying object implements java.util.Map.

A length window instructs the engine to only keep the last N events for a stream. The next statement applies a length window onto the Withdrawal event stream. The statement serves to illustrate the concept of data window and events entering and leaving a data window:

select * from Withdrawal.win:length(5)

The size of this statement's length window is five events. The engine enters all arriving Withdrawal events into the length window. When the length window is full, the oldest Withdrawal event is pushed out of the window. The engine indicates to listeners all events entering the window as new events, and all events leaving the window as old events.

While the term insert stream denotes new events arriving, the term remove stream denotes events leaving a data window, or changing aggregation values. In this example, the remove stream is the stream of Withdrawal events that leave the length window, and such events are posted to listeners as old events.

The next diagram illustrates how the length window contents change as events arrive and shows the events posted to an update listener.


As before, all arriving events are posted as new events to listeners. In addition, when event W1 leaves the length window on arrival of event W6, it is posted as an old event to listeners.
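The behavior described for the length window of size five can be modeled in a few lines of plain Java. The sketch below is not Esper's implementation; the class LengthWindowSketch and the use of strings as events are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical plain-Java model of a length window.
class LengthWindowSketch {
    private final int size;
    private final Deque<String> window = new ArrayDeque<>();

    LengthWindowSketch(int size) {
        this.size = size;
    }

    // The arriving event always enters the window (a "new" event).
    // Returns the event pushed out as a result (an "old" event),
    // or null while the window still has room.
    String add(String event) {
        window.addLast(event);
        return window.size() > size ? window.removeFirst() : null;
    }

    // Window contents, oldest first.
    List<String> contents() {
        return new ArrayList<>(window);
    }

    public static void main(String[] args) {
        LengthWindowSketch window = new LengthWindowSketch(5);
        for (int i = 1; i <= 6; i++) {
            String old = window.add("W" + i);
            System.out.println("new=W" + i + " old=" + old);
        }
    }
}
```

With a size of five, the sixth call to add is the first one to return an old event, matching the arrival of W6 pushing out W1.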

Similar to a length window, a time window also keeps the most recent events up to a given time period. A time window of 5 seconds, for example, keeps the last 5 seconds of events. As seconds pass, the time window actively pushes the oldest events out of the window resulting in one or more old events posted to update listeners.

Filters to event streams allow filtering events out of a given stream before events enter a data window (if there are data windows defined in your query). The statement below shows a filter that selects Withdrawal events with an amount value of 200 or more.

select * from Withdrawal(amount>=200).win:length(5)

With the filter, any Withdrawal events that have an amount of less than 200 do not enter the length window and are therefore not passed to update listeners. Filters are discussed in more detail in Section 5.4.1, “Filter-based Event Streams” and Section 7.4, “Filter Expressions In Patterns”.


The where-clause and having-clause in statements eliminate potential result rows at a later stage in processing, after events have been processed into a statement's data window or other views.

The next statement applies a where-clause to Withdrawal events. Where-clauses are discussed in more detail in Section 5.5, “Specifying Search Conditions: the Where Clause”.

select * from Withdrawal.win:length(5) where amount >= 200

The where-clause applies to both new events and old events. As the diagram below shows, arriving events enter the window; however, only events that pass the where-clause are handed to update listeners. Also, as events leave the data window, only those events that pass the conditions in the where-clause are posted to listeners as old events.
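The practical difference between a stream filter and a where-clause is which events occupy the data window. The following plain-Java sketch models both orderings for a length window. It is a simplified illustration rather than engine code, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical comparison of filter-then-window versus window-then-where.
class FilterVersusWhereSketch {
    // Keeps only the last `size` elements, as a length window would.
    private static List<Double> lengthWindow(List<Double> events, int size) {
        int from = Math.max(0, events.size() - size);
        return new ArrayList<>(events.subList(from, events.size()));
    }

    // Models Withdrawal(amount >= min).win:length(size):
    // the filter runs first, so only passing events enter the window.
    static List<Double> windowAfterFilter(List<Double> amounts, int size, double min) {
        List<Double> passed = new ArrayList<>();
        for (double amount : amounts) {
            if (amount >= min) {
                passed.add(amount);
            }
        }
        return lengthWindow(passed, size);
    }

    // Models Withdrawal.win:length(size) where amount >= min:
    // every event enters the window; the condition only decides what is posted.
    static List<Double> windowBeforeWhere(List<Double> amounts, int size) {
        return lengthWindow(amounts, size);
    }

    public static void main(String[] args) {
        List<Double> amounts = Arrays.asList(500.0, 100.0, 250.0, 50.0, 300.0, 150.0, 400.0);
        System.out.println(windowAfterFilter(amounts, 3, 200));
        System.out.println(windowBeforeWhere(amounts, 3));
    }
}
```

With the filter, small withdrawals never take up a slot in the window. With the where-clause they do, so larger withdrawals can leave the window sooner.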


The where-clause can contain complex conditions while event stream filters are more restrictive in the type of filters that can be specified. The where-clause of the first statement below applies the ceil function of the java.lang.Math Java library class. The insert-into clause makes the results of the first statement available to the second statement:

insert into WithdrawalFiltered select * from Withdrawal where Math.ceil(amount) >= 200
select * from WithdrawalFiltered

In this section we explain the output model of statements employing a time window view and a time batch view.

A time window is a moving window extending to the specified time interval into the past based on the system time. Time windows enable us to limit the number of events considered by a query, as do length windows.

As a practical example, consider the need to determine all accounts where the average withdrawal amount per account for the last 4 seconds of withdrawals is greater than 1000. The statement to solve this problem is shown below.

select account, avg(amount) 
from Withdrawal.win:time(4 sec) 
group by account
having avg(amount) > 1000

The next diagram serves to illustrate the functioning of a time window. For the diagram, we assume a query that simply selects the event itself and does not group or filter events.

select * from Withdrawal.win:time(4 sec)

The diagram starts at a given time t and displays the contents of the time window at t + 4 and t + 5 seconds and so on.


The activity as illustrated by the diagram:

  1. At time t + 4 seconds an event W1 arrives and enters the time window. The engine reports the new event to update listeners.

  2. At time t + 5 seconds an event W2 arrives and enters the time window. The engine reports the new event to update listeners.

  3. At time t + 6.5 seconds an event W3 arrives and enters the time window. The engine reports the new event to update listeners.

  4. At time t + 8 seconds event W1 leaves the time window. The engine reports the event as an old event to update listeners.
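The four steps above can be replayed with a small plain-Java model of a time window. This is a sketch only and not the engine's implementation. Times are expressed in milliseconds relative to t, and the class TimeWindowSketch and its methods are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical plain-Java model of a time window.
class TimeWindowSketch {
    private static final class Timed {
        final String name;
        final long arrivalMillis;

        Timed(String name, long arrivalMillis) {
            this.name = name;
            this.arrivalMillis = arrivalMillis;
        }
    }

    private final long lengthMillis;
    private final Deque<Timed> window = new ArrayDeque<>();

    TimeWindowSketch(long lengthMillis) {
        this.lengthMillis = lengthMillis;
    }

    // An arriving event enters the window and is reported as a new event.
    void add(String name, long arrivalMillis) {
        window.addLast(new Timed(name, arrivalMillis));
    }

    // Moves time forward and returns the events that have now been in the
    // window for the full window length; these are the old events.
    List<String> advanceTo(long nowMillis) {
        List<String> old = new ArrayList<>();
        while (!window.isEmpty()
                && window.peekFirst().arrivalMillis + lengthMillis <= nowMillis) {
            old.add(window.removeFirst().name);
        }
        return old;
    }

    public static void main(String[] args) {
        TimeWindowSketch window = new TimeWindowSketch(4000);
        window.add("W1", 4000);
        window.add("W2", 5000);
        window.add("W3", 6500);
        System.out.println(window.advanceTo(8000)); // W1 has been held for 4 seconds
    }
}
```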

The built-in data windows that act on batches of events are the win:time_batch and the win:length_batch views, among others. The win:time_batch data window collects events arriving during a given time interval and posts collected events as a batch to listeners at the end of the time interval. The win:length_batch data window collects a given number of events and posts collected events as a batch to listeners when the given number of events has been collected.

Related to batch data windows is output rate limiting. While batch data windows retain events, the output clause offered by output rate limiting can control or stabilize the rate at which events are output, see Section 5.7, “Stabilizing and Controlling Output: the Output Clause”.

Let's look at how a time batch window may be used:

select account, amount from Withdrawal.win:time_batch(1 sec)

The above statement collects events arriving during a one-second interval, at the end of which the engine posts the collected events as new events (insert stream) to each listener. The engine posts the events collected during the prior batch as old events (remove stream). The engine starts posting events to listeners one second after it receives the first event and continues at one-second intervals thereafter.
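The hand-over between the current batch and the prior batch can be modeled as follows. This is a plain-Java sketch under the assumption that some external timer calls endOfInterval once per interval. It is not engine code, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical plain-Java model of a time batch window.
class TimeBatchSketch {
    private List<String> current = new ArrayList<>();
    private List<String> previous = new ArrayList<>();

    // Events arriving during the interval are collected, not posted.
    void add(String event) {
        current.add(event);
    }

    // Called at the end of each interval. Element 0 of the result is the batch
    // posted as new events; element 1 is the prior batch posted as old events.
    List<List<String>> endOfInterval() {
        List<String> newEvents = current;
        List<String> oldEvents = previous;
        previous = current;
        current = new ArrayList<>();
        return Arrays.asList(newEvents, oldEvents);
    }

    public static void main(String[] args) {
        TimeBatchSketch batch = new TimeBatchSketch();
        batch.add("W1");
        batch.add("W2");
        System.out.println(batch.endOfInterval());
        batch.add("W3");
        System.out.println(batch.endOfInterval());
    }
}
```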

For statements containing aggregation functions and/or a group by clause, the engine posts consolidated aggregation results for an event batch. For example, consider the following statement:

select sum(amount) as mysum from Withdrawal.win:time_batch(1 sec)

Note that output rate limiting also generates batches of events following the output model as discussed here.

Following SQL (Structured Query Language) standards for queries against relational databases, the presence or absence of aggregation functions and the presence or absence of the group by clause and group_by named parameters for aggregation functions dictate the number of rows posted by the engine to listeners. The next sections outline the output model for batched events under aggregation and grouping. The examples also apply to data windows that don't batch events and post results continuously as events arrive or leave data windows. The examples also apply to patterns providing events when a complete pattern matches.

In summary, as in SQL, if your query only selects aggregation values, the engine provides one row of aggregated values. It provides that row every time the aggregation is updated (insert stream), which is when events arrive or a batch of events gets processed, and when the events leave a data window or a new batch of events arrives. The remove stream then consists of prior aggregation values.

Also as in SQL, if your query selects non-aggregated values along with aggregation values in the select clause, the engine provides a row per event. The insert stream then consists of the aggregation values at the time the event arrives, while the remove stream is the aggregation value at the time the event leaves a data window, if any is defined in your query.

EPL allows each aggregation function to specify its own grouping criteria. Please find further information in Section 5.6.4, “Specifying grouping for each aggregation function”.

The documentation provides output examples for query types in Appendix A, Output Reference and Samples, and the next sections outline each query type.

An event sent by your application or generated by statements is visible to all other statements in the same engine instance. Similarly, current time (the time horizon) moves forward for all statements in the same engine instance. Please see Chapter 15, API Reference for how to send events and how time moves forward through system time or via simulated time, and the possible threading models.

Within an Esper engine instance you can additionally control event visibility and current time on a statement level, under the term isolated service as described in Section 15.10, “Service Isolation”.

An isolated service provides a dedicated execution environment for one or more statements. Events sent to an isolated service are visible only within that isolated service. In the isolated service you can move time forward at the pace and resolution desired without impacting other statements that reside in the engine runtime or other isolated services. You can move statements between the engine and an isolated service.

This section discusses the notion of context and its role in the Esper event processing language (EPL).

When you look up the word context in a dictionary, you may find: Context is the set of circumstances or facts that surround a particular event, situation, etc.

Context-dependent event processing occurs frequently: For example, consider a requirement that monitors banking transactions. For different customers your analysis considers customer-specific aggregations, patterns or data windows. In this example the context of detection is the customer. For a given customer you may want to analyze the banking transactions of that customer by using aggregations, data windows, patterns and other EPL constructs.

In a second example, consider traffic monitoring to detect speed violations. Assume the speed limit must be enforced only between 9 am and 5 pm. The context of detection is of temporal nature.

A context takes a cloud of events and classifies them into one or more sets. These sets are called context partitions. An event processing operation that is associated with a context operates on each of these context partitions independently. (Credit: Taken from the book "Event Processing in Action" by Opher Etzion and Peter Niblett.)

A context is a declaration of dimension and may thus result in one or more context partitions. In the banking transaction example the context dimension is the customer and a context partition exists per customer. In the traffic monitoring example there is a single context partition that exists only between 9 am and 5 pm and does not exist outside of that daily time period.

In an event processing glossary you may find the term event processing agent. An EPL statement is an event processing agent. An alternative term for context partition is event processing agent instance.

Esper EPL allows you to declare a context explicitly via the create context syntax introduced below.

After you have declared a context, one or more EPL statements can refer to that context by specifying the context name. When an EPL statement refers to a context, all EPL-statement related state such as aggregations, patterns or data windows etc. exists once per context partition.

If an EPL statement does not declare a context, it implicitly has a single context partition. The single context partition lives as long as the EPL statement is started and ends when the EPL statement is stopped.

For more information on locking and threading please see Section 15.7, “Engine Threading and Concurrency”. For performance related information please refer to Chapter 21, Performance.

The create context statement declares a context by specifying a context name and context dimension information.

A context declaration by itself does not consume any resources or perform any logic until your application starts at least one statement that refers to that context. Until then the context is inactive and not in use.

When your application creates or starts the first statement that refers to the context, the engine activates the context.

As soon as your application stops or destroys all statements that refer to the context, the context becomes inactive again.

When your application stops or destroys a statement that refers to a context, the context partitions associated with that statement also end (context partitions associated with other started statements live on).

When your application stops or destroys the statement that declared the context but does not stop or destroy the statements that refer to the context, the context partitions associated with each such statement do not end.

When your application destroys the statement that declared the context and destroys all statements that refer to that context then the engine removes the context declaration entirely.

The create context statement posts no output events to listeners or subscribers and does not return any rows when iterated.

This context assigns events to context partitions based on the values of one or more event properties, using the values of these properties as a key that picks a unique context partition directly. Each event thus belongs to exactly one context partition or zero context partitions if the event does not match the optional filter predicate expression(s). Each context partition handles exactly one set of key values.

The syntax for creating a keyed segmented context is as follows:

create context context_name partition [by]
  event_property [and event_property [and ...]] from stream_def
  [, event_property [...] from stream_def]
  [, ...]

The context_name you assign to the context can be any identifier.

Following the context name are one or more lists of event properties and a stream definition for each entry, separated by comma (,).

The event_property is the name(s) of the event properties that provide the value(s) to pick a unique partition. Multiple event property names are separated by the and keyword.

The stream_def is a stream definition which consists of an event type name optionally followed by parentheses that contain filter expressions. If providing filter expressions, only events matching the provided filter expressions for that event type are considered by context partitions. The name of a named window or table is not allowed.

You may list multiple event properties for each stream definition. You may list multiple stream definitions. Please refer to usage guidelines below when specifying multiple event properties and/or multiple stream definitions.

The next statement creates a context SegmentedByCustomer that considers the value of the custId property of the BankTxn event type to pick the context partition to assign events to:

create context SegmentedByCustomer partition by custId from BankTxn

The following statement refers to the context created as above to compute a total withdrawal amount per account for each customer:

context SegmentedByCustomer
select custId, account, sum(amount) from BankTxn group by account
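Conceptually, the statement above keeps one independent set of per-account totals for each customer. The following plain-Java sketch models that idea with a map of maps. It is an illustration only and says nothing about how the engine stores context partitions; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plain-Java model of a keyed segmented context.
class KeyedContextSketch {
    // One entry per context partition, keyed by custId. Each partition holds
    // its own "sum(amount) group by account" state.
    private final Map<String, Map<String, Double>> partitions = new HashMap<>();

    // Routes the transaction to the partition for custId and returns the
    // updated total for that customer's account.
    double onBankTxn(String custId, String account, double amount) {
        Map<String, Double> totals =
                partitions.computeIfAbsent(custId, key -> new HashMap<>());
        return totals.merge(account, amount, Double::sum);
    }

    int partitionCount() {
        return partitions.size();
    }

    public static void main(String[] args) {
        KeyedContextSketch context = new KeyedContextSketch();
        context.onBankTxn("cust-1", "checking", 100.0);
        context.onBankTxn("cust-2", "checking", 10.0);
        System.out.println(context.onBankTxn("cust-1", "checking", 50.0));
        System.out.println(context.partitionCount());
    }
}
```

Because each customer has its own partition, the EPL statement only needs to group by account, not by custId.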

The following statement refers to the context created as above and detects a withdrawal of more than 400 followed by a second withdrawal of more than 400 that occurs within 10 minutes of the first withdrawal, all for the same customer:

context SegmentedByCustomer
select * from pattern [
  every a=BankTxn(amount > 400) -> b=BankTxn(amount > 400) where timer:within(10 minutes)
]

The EPL statement that refers to a keyed segmented context must have at least one filter expression, at any place within the EPL statement that looks for events of any of the event types listed in the context declaration.

For example, the following is not valid:

// Neither LoginEvent nor LogoutEvent are listed in the context declaration
context SegmentedByCustomer
select * from pattern [every a=LoginEvent -> b=LogoutEvent where timer:within(10 minutes)]

If the context declaration lists multiple streams, each event type must be unrelated: You may not list the same event type twice and you may not list a sub- or super-type of any event type already listed.

The following is not a valid declaration since the BankTxn event type is listed twice:

// Not valid
create context SegmentedByCustomer partition by custId from BankTxn, account from BankTxn

If the context declaration lists multiple streams, the number of event properties provided for each event type must also be the same. The value type returned by event properties of each event type must match within the respective position it is listed in, i.e. the first property listed for each event type must have the same type, the second property listed for each event type must have the same type, and so on.

The following is not a valid declaration since the customer id of BankTxn and login time of LoginEvent are not the same type:

// Invalid: Type mismatch between properties
create context SegmentedByCustomer partition by custId from BankTxn, loginTime from LoginEvent

The next statement creates a context SegmentedByCustomer that also considers LoginEvent and LogoutEvent:

create context SegmentedByCustomer partition by 
  custId from BankTxn, loginId from LoginEvent, loginId from LogoutEvent

As you may have noticed, the above example refers to loginId as the event property name for LoginEvent and LogoutEvent events. The assumption is that the loginId event property of the login and logout events has the same type and carries the same exact value as the custId of bank transaction events, thereby allowing all events of the three event types to apply to the same customer-specific context partition.

This context assigns events to context partitions based on the result of a hash function and modulo operation. Each event thus belongs to exactly one context partition or zero context partitions if the event does not match the optional filter predicate expression(s). Each context partition handles exactly one result of hash value modulo granularity.

The syntax for creating a hashed segmented context is as follows:

create context context_name coalesce [by] 
  hash_func_name(hash_func_param) from stream_def
  [, hash_func_name(hash_func_param) from stream_def ]
  [, ...]
  granularity granularity_value
  [preallocate]

The context_name you assign to the context can be any identifier.

Following the context name are one or more pairs of hash function name and parameters, with a stream definition for each entry, separated by comma (,).

The hash_func_name can either be consistent_hash_crc32 or hash_code or a plug-in single-row function. The hash_func_param is a list of parameter expressions.

The stream_def is a stream definition which consists of an event type name optionally followed by parentheses that contain filter expressions. If providing filter expressions, only events matching the provided filter expressions for that event type are considered by context partitions. The name of a named window or table is not allowed.

You may list multiple stream definitions. Please refer to usage guidelines below when specifying multiple stream definitions.

The granularity is required and is an integer number that defines the maximum number of context partitions. The engine computes the hash code modulo the granularity, hash(params) mod granularity, to determine the context partition. When you specify the hash_code function the engine uses the object hash code and the computation is params.hashCode() % granularity.
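The hash-modulo-granularity computation can be tried out in plain Java. The sketch below is a standalone illustration and is not guaranteed to produce the same partition numbers as the engine. In particular, hashing the UTF-8 bytes of the key and using Math.floorMod to keep negative hash codes in range are assumptions of this sketch, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical plain-Java illustration of hash(params) mod granularity.
class HashContextSketch {
    // CRC-32 of the key's UTF-8 bytes, modulo granularity.
    // CRC32.getValue() is never negative, so the result is in [0, granularity).
    static int crc32Partition(String key, int granularity) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % granularity);
    }

    // Object hash code modulo granularity. hashCode() can be negative and the
    // Java % operator would then yield a negative result, so this sketch uses
    // Math.floorMod to stay within [0, granularity).
    static int hashCodePartition(Object key, int granularity) {
        return Math.floorMod(key.hashCode(), granularity);
    }

    public static void main(String[] args) {
        System.out.println(crc32Partition("customer-42", 16));
        System.out.println(hashCodePartition("customer-42", 16));
    }
}
```

Different keys can map to the same partition number, which is why a statement using such a context typically still groups by the key property.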

Since the engine locks on the level of context partition to protect state, the granularity defines the maximum degree of parallelism. For example, a granularity of 1024 means that 1024 context partitions handle events and thus a maximum of 1024 threads can process each assigned statement concurrently.

The optional preallocate keyword instructs the engine to allocate all context partitions at once at the time a statement refers to the context. This is beneficial for performance as the engine does not need to determine whether a context partition exists and dynamically allocate, but may require more memory.

The next statement creates a context SegmentedByCustomerHash that considers the CRC-32 hash code of the custId property of the BankTxn event type to pick the context partition to assign events to, with up to 16 different context partitions that are preallocated:

create context SegmentedByCustomerHash
  coalesce by consistent_hash_crc32(custId) from BankTxn granularity 16 preallocate

The following statement refers to the context created as above to compute a total withdrawal amount per account for each customer:

context SegmentedByCustomerHash
select custId, account, sum(amount) from BankTxn group by custId, account

Note that the statement above groups by custId: Since the events for different customer ids can be assigned to the same context partition, it is necessary that the EPL statement also groups by customer id.

The context declaration shown next assumes that the application provides a computeHash single-row function that accepts BankTxn as a parameter. The function must return an integer value that provides the context partition id for each event:

create context MyHashContext
  coalesce by computeHash(*) from BankTxn granularity 16 preallocate

The EPL statement that refers to a hash segmented context must have at least one filter expression, at any place within the EPL statement that looks for events of any of the event types listed in the context declaration.

For example, the following is not valid:

// Neither LoginEvent nor LogoutEvent are listed in the context declaration
context SegmentedByCustomerHash
select * from pattern [every a=LoginEvent -> b=LogoutEvent where timer:within(10 minutes)]

This context assigns events to context partitions based on the values of one or more event properties, using a predicate expression(s) to define context partition membership. Each event can thus belong to zero, one or many context partitions depending on the outcome of the predicate expression(s).

The syntax for creating a category segmented context is as follows:

create context context_name
  group [by] group_expression as category_label
  [, group [by] group_expression as category_label]
  [, ...]
  from stream_def

The context_name you assign to the context can be any identifier.

Following the context name is a list of groups separated by the group keyword. The list of groups is followed by the from keyword and a stream definition.

The group_expression is an expression that categorizes events. Each group expression must be followed by the as keyword and a category label which can be any identifier.

Group expressions are predicate expressions and must return a Boolean true or false when applied to an event. For a given event, any number of the group expressions may return true, thus categories can be overlapping.

The stream_def is a stream definition which consists of an event type name optionally followed by parentheses that contain filter expressions. If providing filter expressions, only events matching the provided filter expressions for that event type are considered by context partitions.

The next statement creates a context CategoryByTemp that consider the value of the temperature property of the SensorEvent event type to pick context partitions to assign events to:

create context CategoryByTemp
  group temp < 65 as cold,
  group temp between 65 and 85 as normal,
  group temp > 85 as large
  from SensorEvent

The following statement simply counts, for each category, the number of events and outputs the category label and count:

context CategoryByTemp select context.label, count(*) from SensorEvent
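Because each group expression is evaluated on its own, a declaration may also let categories overlap or leave gaps. The following sketch assumes the SensorEvent type from above; the context name and category labels are made up for illustration:

```epl
// Hypothetical context: the two categories overlap and do not cover all values
create context CategoryByTempOverlap
  group temp < 65 as cold,
  group temp < 32 as freezing
  from SensorEvent

// Counts events per category label
context CategoryByTempOverlap
select context.label, count(*) from SensorEvent
```

Under the rules above, an event with a temp of 20 satisfies both predicates and so counts in both the cold and the freezing partition, while an event with a temp of 70 satisfies neither and belongs to no context partition.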

You may declare a non-overlapping context that exists once or that repeats in a regular fashion as controlled by start and end conditions. The number of context partitions is always either one or zero: Context partitions do not overlap.

The syntax for creating a non-overlapping context is as follows:

create context context_name
  start (@now | start_condition) 
  end end_condition

The context_name you assign to the context can be any identifier.

Following the context name is the start keyword, either @now or a start_condition, the end keyword and an end_condition.

Both the start (if specified) and end condition can be an event filter, a pattern, a crontab or a time period. The syntax of start and end conditions is described in Section 4.2.7, “Context Conditions”.

Once the start condition occurs, the engine no longer observes the start condition and begins observing the end condition. Once the end condition occurs, the engine observes the start condition again. If you specified @now instead of a start condition, the engine begins observing the end condition immediately.

If you specified an event filter as the start condition, then the event also counts towards the statement(s) that refer to that context. If you specified a pattern as the start condition, then the events that may constitute the pattern match can also count towards the statement(s) that refer to the context provided that @inclusive and event tags are both specified (see below).

At the time of context activation, when your application creates a statement that utilizes the context, the engine checks whether the start and end conditions are crontab expressions. If they are, the engine evaluates the start and end crontab expressions and determines whether the current time falls between start and end. If the current time is between the start and end times, the engine allocates the context partition and waits to observe the end time. Otherwise the engine waits to observe the start time and does not allocate a context partition.

The built-in context properties that are available are the same as described in Section 4.2.6.2, “Built-In Context Properties”.

The next statement creates a context NineToFive that declares a daily time period that starts at 9 am and ends at 5 pm:

create context NineToFive start (0, 9, *, *, *) end (0, 17, *, *, *)

The following statement outputs speed violations between 9 am and 5 pm, considering a speed of 100 or greater as a violation:

context NineToFive select * from TrafficEvent(speed >= 100)

The example that follows demonstrates the use of an event filter as the start condition and a pattern as the end condition.

The next statement creates a context PowerOutage that starts when the first PowerOutageEvent event arrives and that ends 5 seconds after a subsequent PowerOnEvent arrives:

create context PowerOutage start PowerOutageEvent end pattern [PowerOnEvent -> timer:interval(5)]

The following statement outputs the temperature during a power outage and for 5 seconds after the power comes on:

context PowerOutage select * from TemperatureEvent

To output only the last value when a context partition ends (terminates, expires), please read on to the description of output rate limiting.

The next statement creates a context Every15Minutes that starts immediately and lasts for 15 minutes, repeatedly allocating a new context partition at the end of each 15-minute interval:

create context Every15Minutes start @now end after 15 minutes

Tip

A non-overlapping context with @now is always-on: A context partition is always allocated at any given point in time. Only if @now is specified will a context partition always exist at any point in time.

Note

If you specified an event filter or pattern as the end condition for a context partition, and statements that refer to the context specify an event filter or pattern that matches the same conditions, use @Priority to instruct the engine whether the context management or the statement evaluation takes priority (see below for configuring prioritized execution).

For example, if your context declaration looks like this:

create context MyCtx start MyStartEvent end MyEndEvent

And a statement managed by the context is this:

context MyCtx select count(*) as cnt from MyEndEvent output when terminated

By using @Priority(1) for create-context and @Priority(0) for the counting statement, the counting statement does not count the last MyEndEvent since context partition management takes priority.

By using @Priority(0) for create-context and @Priority(1) for the counting statement, the counting statement will count the last MyEndEvent since the statement evaluation takes priority.
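Written out, the first of these two arrangements could look like the sketch below. It assumes that prioritized execution has been configured for the engine, as the note above mentions:

```epl
// Context management takes priority: the final MyEndEvent is not counted
@Priority(1) create context MyCtx start MyStartEvent end MyEndEvent

@Priority(0) context MyCtx
select count(*) as cnt from MyEndEvent output when terminated
```

Swapping the two priority values gives the second arrangement, in which the final MyEndEvent is counted.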

An overlapping context initiates a new context partition when an initiating condition occurs, and terminates one or more context partitions when the terminating condition occurs. The engine maintains one context partition for each time the initiating condition fired, and discards a context partition when its terminating condition fires.

The syntax for creating an overlapping context is as follows:

create context context_name
  initiated [by] [distinct (distinct_value_expr [,...])] [@now and] initiating_condition
  terminated [by] terminating_condition

The context_name you assign to the context can be any identifier.

Following the context name is the initiated keyword. After the initiated keyword you can optionally specify the distinct keyword and, within parentheses, list one or more distinct value expressions. After the initiated keyword you can also specify @now and, as explained below.

After the initiated keyword you must specify the initiating condition. The initiating condition is followed by the terminated keyword and the terminating condition.

Both the initiating and terminating condition can be an event filter, a pattern, a crontab or a time period. The syntax of initiating and terminating conditions is described in Section 4.2.7, “Context Conditions”.

If you specified @now and before the initiating condition, then the engine initiates a new context partition immediately. The @now and keywords are only allowed in conjunction with initiating conditions that specify a pattern, crontab or time period, and not with event filters.

If you specified an event filter for the initiating condition, then the event that initiates a new context partition also counts towards the statement(s) that refer to that context. If you specified a pattern to initiate a new context partition, then the events that may constitute the pattern match can also count towards the statement(s) that refer to the context provided that @inclusive and event tags are both specified (see below).
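As an illustration of the optional distinct keyword in the syntax above, the following sketch assumes two hypothetical event types, NewOrderEvent and CloseOrderEvent, that both carry an orderId property. The intent is that the engine keeps at most one active context partition per orderId value, so that a repeated NewOrderEvent for an order that is still open does not initiate a second context partition; treat that reading as an assumption to verify against the engine's behavior:

```epl
// Hypothetical context and event types; orderId is the distinct value expression
create context CtxOrders
  initiated by distinct(orderId) NewOrderEvent as newOrder
  terminated by CloseOrderEvent(orderId = newOrder.orderId)
```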

The next statement creates a context CtxTrainEnter that allocates a new context partition when a train enters a station, and that terminates each context partition 5 minutes after the time the context partition was allocated:

create context CtxTrainEnter
  initiated by TrainEnterEvent as te
  terminated after 5 minutes

The context declared above assigns the stream name te. Thereby the initiating event's properties can be accessed, for example, by specifying context.te.trainId.

The following statement detects when a train enters a station as indicated by a TrainEnterEvent, but does not leave the station within 5 minutes as would be indicated by a matching TrainLeaveEvent:

context CtxTrainEnter
select t1 from pattern [
  t1=TrainEnterEvent -> timer:interval(5 min) and not TrainLeaveEvent(trainId = context.te.trainId)
  ]

Since the TrainEnterEvent that initiates a new context partition also counts towards the statement, the first part of the pattern (the t1=TrainEnterEvent) is satisfied by that initiating event.

The next statement creates a context CtxEachMinute that allocates a new context partition immediately and every 1 minute, and that terminates each context partition 1 minute after the time the context partition was allocated:

create context CtxEachMinute
  initiated @now and pattern [every timer:interval(1 minute)]
  terminated after 1 minutes

The statement above specifies @now to instruct the engine to allocate a new context partition immediately as well as when the pattern fires. Without the @now the engine would only allocate a new context partition when the pattern fires after 1 minute and every minute thereafter.

The following statement averages the temperature, starting anew every 1 minute and outputs the aggregate value continuously:

context CtxEachMinute select avg(temp) from SensorEvent

To output only the last value when a context partition ends (terminates, expires), please read on to the description of output rate limiting.

Note

If you specified an event filter or pattern as the termination condition for a context partition, and statements that refer to the context specify an event filter or pattern that matches the same conditions, use @Priority to instruct the engine whether the context management or the statement evaluation takes priority (see below for configuring prioritized execution). See the note above for more information.

Context start/initiating and end/terminating conditions are for use with overlapping and non-overlapping contexts. Any combination of conditions may be specified.

Use the syntax described here to define the stream that starts/initiates a context partition or that ends/terminates a context partition.

The syntax is:

event_stream_name [(filter_criteria)] [as stream_name]

The event_stream_name is either the name of an event type or name of an event stream populated by an insert into statement. The filter_criteria is optional and consists of a list of expressions filtering the events of the event stream, within parenthesis after the event stream name.

Two examples are:

// A non-overlapping context that starts when MyStartEvent arrives and ends when MyEndEvent arrives
create context MyContext start MyStartEvent end MyEndEvent
// An overlapping context where each MyEvent with level greater than zero 
// initiates a new context partition that terminates after 10 seconds
create context MyContext initiated MyEvent(level > 0) terminated after 10 seconds

You may correlate the start/initiating and end/terminating streams by providing a stream name following the as keyword, and by referring to that stream name in the filter criteria of the end condition.

Two examples that correlate the start/initiating and end/terminating condition are:

// A non-overlapping context that starts when MyEvent arrives
// and ends when a matching MyEvent arrives (same id)
create context MyContext 
start MyEvent as myevent
end MyEvent(id=myevent.id)
// An overlapping context where each MyInitEvent initiates a new context partition 
// that terminates when a matching MyTermEvent arrives 
create context MyContext 
initiated by MyInitEvent as e1 
terminated by MyTermEvent(id=e1.id, level <> e1.level)

You can define a pattern that starts/initiates a context partition or that ends/terminates a context partition.

The syntax is:

pattern [pattern_expression] [@inclusive]

The pattern_expression is a pattern as described in Chapter 7, EPL Reference: Patterns.

Specify @inclusive after the pattern to have those same events that constitute the pattern match also count towards any statements that are associated to the context. You must also provide a tag for each event in a pattern that should be included.

Examples are:

// A non-overlapping context that starts when either StartEventOne or StartEventTwo arrives
// and that ends after 5 seconds.
// Here neither StartEventOne nor StartEventTwo counts towards any statements
// that refer to the context.
create context MyContext 
  start pattern [StartEventOne or StartEventTwo] 
  end after 5 seconds
// Same as above. 
// Here both StartEventOne and StartEventTwo do count towards any statements
// that are referring to the context.
create context MyContext 
  start pattern [a=StartEventOne or b=StartEventTwo] @inclusive 
  end after 5 seconds
// An overlapping context where each distinct MyInitEvent initiates a new context 
// and each context partition terminates after 20 seconds
// We use @inclusive to say that the same MyInitEvent that fires the pattern
// also applies to statements that are associated to the context.
create context MyContext
  initiated by pattern [every-distinct(a.id, 20 sec) a=MyInitEvent]@inclusive
  terminated after 20 sec
// An overlapping context where each pattern match initiates a new context 
// and all context partitions terminate when MyTermEvent arrives.
// The MyInitEvent and MyOtherEvent that trigger the pattern are themselves not included 
// in any statements that are associated to the context.
create context MyContext
  initiated by pattern [every MyInitEvent -> MyOtherEvent where timer:within(5)]
  terminated by MyTermEvent

You may correlate the start and end streams by providing tags as part of the pattern, and by referring to the tag name(s) in the filter criteria of the end condition.

An example that correlates the start and end condition is:

// A non-overlapping context that starts when either StartEventOne or StartEventTwo arrives
// and that ends when either a matching EndEventOne or EndEventTwo arrives
create context MyContext 
  start pattern [a=StartEventOne or b=StartEventTwo]@inclusive
  end pattern [EndEventOne(id=a.id) or EndEventTwo(id=b.id)]

Crontab expressions are described in Section 7.6.4, “Crontab (timer:at)”.

Examples are:

// A non-overlapping context that is active daily between 9 am and 5 pm
// and not active outside of these hours:
create context NineToFive start (0, 9, *, *, *) end (0, 17, *, *, *)
// An overlapping context where crontab initiates a new context every 1 minute
// and each context partition terminates after 10 seconds:
create context MyContext initiated (*, *, *, *, *) terminated after 10 seconds

You may specify a time period that the engine observes before the condition fires. Time period expressions are described in Section 5.2.1, “Specifying Time Periods”.

The syntax is:

after time_period_expression

Examples are:

// A non-overlapping context started after 10 seconds 
// that ends 1 minute after it starts and that again starts 10 seconds thereafter.
create context NonOverlap10SecFor1Min start after 10 seconds end after 1 minute
// An overlapping context that starts a new context partition every 5 seconds
// and each context partition lasts 1 minute
create context Overlap5SecFor1Min initiated after 5 seconds terminated after 1 minute

A nested context is a context that is composed from two or more contexts.

The syntax for creating a nested context is as follows:

create context context_name
  context nested_context_name [as] nested_context_definition ,
  context nested_context_name [as] nested_context_definition [, ...]

The context_name you assign to the context can be any identifier.

Following the context name is a comma-separated list of nested contexts. For each nested context specify the context keyword followed by a nested context name and the nested context declaration. Any of the context declarations as outlined in Section 4.2, “Context Declaration” are allowed for nested contexts. The order of nested context declarations matters as outlined below. The nested context names have meaning only with respect to built-in properties, and statements may not be assigned to nested context names.

The next statement creates a nested context NineToFiveSegmented that, between 9 am and 5 pm, allocates a new context partition for each customer id:

create context NineToFiveSegmented
  context NineToFive start (0, 9, *, *, *) end (0, 17, *, *, *),
  context SegmentedByCustomer partition by custId from BankTxn

The following statement refers to the nested context to compute a total withdrawal amount per account for each customer but only between 9 am and 5 pm:

context NineToFiveSegmented
select custId, account, sum(amount) from BankTxn group by account

Esper implements nested contexts as a context tree: The context declared first controls the lifecycle of the context(s) declared thereafter. Thereby, in the above example, outside of the 9 am to 5 pm time the engine has no memory and consumes no resources in relation to bank transactions or customer ids.
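Nested context names come into play when a statement reads built-in context properties, which are then qualified by the name of the nested context. The sketch below assumes the property names startTime (for the non-overlapping NineToFive context) and key1 (for the keyed segmented SegmentedByCustomer context); check these names against Section 4.3.1, “Built-In Nested Context Properties”:

```epl
// Built-in properties are prefixed with the nested context name (property names assumed)
context NineToFiveSegmented
select context.NineToFive.startTime, context.SegmentedByCustomer.key1, sum(amount)
from BankTxn
```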

When combining segmented contexts, the set of context partitions for the nested context effectively is the Cartesian product of the partition sets of the nested segmented contexts.

When combining temporal contexts with other contexts, since temporal contexts may overlap and may terminate, it is important to understand that temporal contexts control the lifecycle of sub-contexts (contexts declared thereafter). The order of declaration of contexts in a nested context can thereby change resource usage and output result.
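To make the effect of declaration order concrete, the sketch below reverses the two nested contexts of the NineToFiveSegmented example. The context name is hypothetical, and the behavior described afterwards is inferred from the rule that the context declared first controls the lifecycle of those declared after it:

```epl
// Hypothetical variant: the keyed segmented context is now the outer context
create context SegmentedNineToFive
  context SegmentedByCustomer partition by custId from BankTxn,
  context NineToFive start (0, 9, *, *, *) end (0, 17, *, *, *)
```

With this order the engine would be expected to track customer ids at all times and to manage a separate 9 am to 5 pm lifecycle under each customer id, rather than holding no state outside of those hours.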

The next statement creates a context that allocates context partitions only when a train enters a station, then one for each hash of the passenger tag id as indicated by PassengerScanEvent events, and that terminates all context partitions after 5 minutes:

create context CtxNestedTrainEnter
  context InitCtx initiated by TrainEnterEvent as te terminated after 5 minutes,
  context HashCtx coalesce by consistent_hash_crc32(tagId) from PassengerScanEvent
    granularity 16 preallocate

In the example above the engine does not start tracking PassengerScanEvent events or hash codes or allocate context partitions until a TrainEnterEvent arrives.

You may use output rate limiting to trigger output when a context partition ends, as further described in Section 5.7, “Stabilizing and Controlling Output: the Output Clause”.

Consider the fixed temporal (non-overlapping) context: A new context partition gets allocated at the designated start time and the current context partition ends at the designated end time. To trigger output when the context partition ends and before it gets removed, read on.

The same is true for the initiated temporal (overlapping) context: That context starts a new context partition when trigger events arrive or when a pattern matches. Each context partition expires (ends, terminates) after the specified time period has passed. To trigger output at the time the context partition expires, read on.

You may use the when terminated syntax with output rate limiting to trigger output when a context partition ends. The following example demonstrates the idea by declaring an initiated temporal context.

The next statement creates a context CtxEachMinute that initiates a new context partition every 1 minute, and that expires each context partition after 5 minutes:

create context CtxEachMinute
initiated by pattern [every timer:interval(1 min)]
terminated after 5 minutes

The following statement computes an ongoing average temperature but outputs only the last value of the average temperature, after 5 minutes, when a context partition ends:

context CtxEachMinute
select context.id, avg(temp) from SensorEvent output snapshot when terminated

The when terminated syntax can be combined with other output rates.

The next example outputs every 1 minute and also when the context partition ends:

context CtxEachMinute
select context.id, avg(temp) from SensorEvent output snapshot every 1 minute and when terminated

In the case that the end/terminating condition of the context partition is an event or pattern, the context properties contain the information of the tagged events in the pattern or the single event that ended/terminated the context partition.

For example, consider the following context, which initiates a new context partition for each arriving MyStartEvent event and terminates a context partition when a matching MyEndEvent arrives:

create context CtxSample
initiated by MyStartEvent as startevent
terminated by MyEndEvent(id = startevent.id) as endevent

The following statement outputs the id property of the initiating and terminating event and only outputs when a context partition ends:

context CtxSample 
select context.startevent.id, context.endevent.id, count(*) from MyEvent
output snapshot when terminated

You may in addition specify a termination expression that the engine evaluates when a context partition terminates. Only when the termination expression evaluates to true does output occur. The expression may refer to built-in properties as described in Section 5.7.1.1, “Controlling Output Using an Expression”. The syntax is as follows:

...output when terminated and termination_expression

The next example statement outputs when a context partition ends but only if at least two events are available for output:

context CtxEachMinute
select * from SensorEvent output when terminated and count_insert >= 2

The final example EPL outputs when a context partition ends and sets the variable myvar to a new value:

context CtxEachMinute
select * from SensorEvent output when terminated then set myvar=3

Named windows are globally-visible data windows that may be referred to by multiple statements. You may refer to named windows in statements that declare a context without any special considerations.

You may also create a named window and declare a context for the named window. In this case the engine in effect manages separate named windows, one for each context partition. Limitations apply in this case that we discuss herein.

For example, consider the 9 am to 5 pm fixed temporal context as shown earlier:

create context NineToFive start (0, 9, *, *, *) end (0, 17, *, *, *)

You may create a named window that only exists between 9 am and 5 pm:

context NineToFive create window SpeedingEvents1Hour.win:time(30 min) as TrafficEvent

You can insert into the named window:

insert into SpeedingEvents1Hour select * from TrafficEvent(speed > 100)

Any on-merge, on-select, on-update and on-delete statements must however declare the same context.

The following is not a valid statement as it does not declare the same context that was used to declare the named window:

// You must declare the same context for on-trigger statements
on TruncateEvent delete from SpeedingEvents1Hour

The following is valid:

context NineToFive on TruncateEvent delete from SpeedingEvents1Hour

For context declarations that require specifying event types, such as the hash segmented context and keyed segmented context, please provide the named window's underlying event type.

The following sample EPL statements define a type for the named window, declare a context and associate the named window to the context:

create schema ScoreCycle (userId string, keyword string, productId string, score long)
create context HashByUserCtx as 
  coalesce by consistent_hash_crc32(userId) from ScoreCycle granularity 64
context HashByUserCtx create window ScoreCycleWindow.std:unique(productId, keyword) as ScoreCycle

Selecting specific context partitions and interrogating context partitions is useful for statement iteration and for on-demand queries.

Esper provides APIs to identify, filter and select context partitions for statement iteration and on-demand queries. The APIs are described in detail at Section 15.19, “Context Partition Selection”.

For statement iteration, your application can provide context selector objects to the iterate and safeIterate methods on EPStatement. If your code does not provide context selectors the iteration considers all context partitions. At the time of iteration, the engine obtains the current set of context partitions and iterates each independently. If your statement has an order-by clause, the order-by clause orders within the context partition and does not order across context partitions.

For on-demand queries, your application can provide context selector objects to the executeQuery method on EPRuntime and to the execute method on EPOnDemandPreparedQuery. If your code does not provide context selectors the on-demand query considers all context partitions. At the time of on-demand query execution, the engine obtains the current set of context partitions and queries each independently. If the on-demand query has an order-by clause, the order-by clause orders within the context partition and does not order across context partitions.

5.1. EPL Introduction
5.2. EPL Syntax
5.2.1. Specifying Time Periods
5.2.2. Using Comments
5.2.3. Reserved Keywords
5.2.4. Escaping Strings
5.2.5. Data Types
5.2.6. Using Constants and Enum Types
5.2.7. Annotation
5.2.8. Expression Alias
5.2.9. Expression Declaration
5.2.10. Script Declaration
5.2.11. Referring to a Context
5.3. Choosing Event Properties And Events: the Select Clause
5.3.1. Choosing all event properties: select *
5.3.2. Choosing specific event properties
5.3.3. Expressions
5.3.4. Renaming event properties
5.3.5. Choosing event properties and events in a join
5.3.6. Choosing event properties and events from a pattern
5.3.7. Selecting insert and remove stream events
5.3.8. Qualifying property names and stream names
5.3.9. Select Distinct
5.3.10. Transposing an Expression Result to a Stream
5.3.11. Selecting EventBean instead of Underlying Event
5.4. Specifying Event Streams: the From Clause
5.4.1. Filter-based Event Streams
5.4.2. Pattern-based Event Streams
5.4.3. Specifying Views
5.4.4. Multiple Data Window Views
5.4.5. Using the Stream Name
5.5. Specifying Search Conditions: the Where Clause
5.6. Aggregates and grouping: the Group-by Clause and the Having Clause
5.6.1. Using aggregate functions
5.6.2. Organizing statement results into groups: the Group-by clause
5.6.3. Using Group-By with Rollup, Cube and Grouping Sets
5.6.4. Specifying grouping for each aggregation function
5.6.5. Selecting groups of events: the Having clause
5.6.6. How the stream filter, Where, Group By and Having clauses interact
5.6.7. Comparing Keyed Segmented Context, the Group By clause and the std:groupwin view
5.7. Stabilizing and Controlling Output: the Output Clause
5.7.1. Output Clause Options
5.7.2. Aggregation, Group By, Having and Output clause interaction
5.7.3. Runtime Considerations
5.8. Sorting Output: the Order By Clause
5.9. Limiting Row Count: the Limit Clause
5.10. Merging Streams and Continuous Insertion: the Insert Into Clause
5.10.1. Transposing a Property To a Stream
5.10.2. Merging Streams By Event Type
5.10.3. Merging Disparate Types of Events: Variant Streams
5.10.4. Decorated Events
5.10.5. Event as a Property
5.10.6. Instantiating and Populating an Underlying Event Object
5.10.7. Transposing an Expression Result
5.10.8. Select-Clause Expression And Inserted-Into Column Event Type
5.11. Subqueries
5.11.1. The 'exists' Keyword
5.11.2. The 'in' and 'not in' Keywords
5.11.3. The 'any' and 'some' Keywords
5.11.4. The 'all' Keyword
5.11.5. Subquery With Group By Clause
5.11.6. Multi-Column Selection
5.11.7. Multi-Row Selection
5.11.8. Hints Related to Subqueries
5.12. Joining Event Streams
5.12.1. Introducing Joins
5.12.2. Inner (Default) Joins
5.12.3. Outer, Left and Right Joins
5.12.4. Unidirectional Joins
5.12.5. Hints Related to Joins
5.13. Accessing Relational Data via SQL
5.13.1. Joining SQL Query Results
5.13.2. SQL Query and the EPL Where Clause
5.13.3. Outer Joins With SQL Queries
5.13.4. Using Patterns to Request (Poll) Data
5.13.5. Polling SQL Queries via Iterator
5.13.6. JDBC Implementation Overview
5.13.7. Oracle Drivers and No-Metadata Workaround
5.13.8. SQL Input Parameter and Column Output Conversion
5.13.9. SQL Row POJO Conversion
5.14. Accessing Non-Relational Data via Method Invocation
5.14.1. Joining Method Invocation Results
5.14.2. Polling Method Invocation Results via Iterator
5.14.3. Providing the Method
5.14.4. Using a Map Return Type
5.14.5. Using a Object Array Return Type
5.15. Declaring an Event Type: Create Schema
5.15.1. Declare an Event Type by Providing Names and Types
5.15.2. Declare an Event Type by Providing a Class Name
5.15.3. Declare a Variant Stream
5.16. Splitting and Duplicating Streams
5.17. Variables and Constants
5.17.1. Creating Variables: the Create Variable clause
5.17.2. Setting Variable Values: the On Set clause
5.17.3. Using Variables
5.17.4. Object-Type Variables
5.17.5. Class and Event-Type Variables
5.18. Declaring Global Expressions, Aliases And Scripts: Create Expression
5.18.1. Global Expression Aliases
5.18.2. Global Expression Declarations
5.18.3. Global Scripts
5.19. Contained-Event Selection
5.19.1. Select-Clause in a Contained-Event Selection
5.19.2. Where Clause in a Contained-Event Selection
5.19.3. Contained-Event Selection and Joins
5.19.4. Sentence and Word Example
5.19.5. More Examples
5.19.6. Arrays returned by a Contained Expression
5.19.7. Contained-Event Limitations
5.20. Updating an Insert Stream: the Update IStream Clause
5.20.1. Immutability and Updates
5.21. Controlling Event Delivery : The For Clause

The Event Processing Language (EPL) is a SQL-standard language with extensions, offering SELECT, FROM, WHERE, GROUP BY, HAVING and ORDER BY clauses. Streams replace tables as the source of data with events replacing rows as the basic unit of data. Since events are composed of data, the SQL concepts of correlation through joins, filtering and aggregation through grouping can be effectively leveraged.

The INSERT INTO clause is recast as a means of forwarding events to other streams for further downstream processing. External data accessible through JDBC may be queried and joined with the stream data. Additional clauses such as the PATTERN and OUTPUT clauses are also available to provide constructs specific to event processing that SQL lacks.

The purpose of the UPDATE clause is to update event properties. Update takes place before an event applies to any selecting statements or pattern statements.

EPL statements are used to derive and aggregate information from one or more streams of events, and to join or merge event streams. This section outlines EPL syntax. It also outlines the built-in views, which are the building blocks for deriving and aggregating information from event streams.

EPL statements contain definitions of one or more views. Similar to tables in a SQL statement, views define the data available for querying and filtering. Some views represent windows over a stream of events. Other views derive statistics from event properties, group events or handle unique event property values. Views can be staggered onto each other to build a chain of views. The Esper engine makes sure that views are reused among EPL statements for efficiency.

The built-in set of views is:

EPL provides the concept of a named window. Named windows are data windows that can be inserted-into and deleted-from by one or more statements, and that can be queried by one or more statements. Named windows have a global character, being visible and shared across an engine instance beyond a single statement. Use the CREATE WINDOW clause to create named windows. Use the ON MERGE clause to atomically merge events into named window state, the INSERT INTO clause to insert data into a named window, the ON DELETE clause to remove events from a named window, the ON UPDATE clause to update events held by a named window and the ON SELECT clause to perform a query triggered by a pattern or arriving event on a named window. Finally, the name of the named window can occur in a statement's FROM clause to query a named window or include the named window in a join or subquery.
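The clauses listed above could fit together as in the following sketch. The named window OrdersWindow and the event types OrderEvent, CancelEvent and QueryEvent, together with the orderId property, are hypothetical and serve only to illustrate the shape of each statement:

```epl
// Create a named window holding the last 30 seconds of OrderEvent events
create window OrdersWindow.win:time(30 sec) as OrderEvent

// Insert arriving events into the named window
insert into OrdersWindow select * from OrderEvent

// Remove matching events from the named window when a CancelEvent arrives
on CancelEvent as c delete from OrdersWindow as w where w.orderId = c.orderId

// Query the named window each time a QueryEvent arrives
on QueryEvent select w.* from OrdersWindow as w

// Refer to the named window in a from clause
select count(*) from OrdersWindow
```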

EPL provides the concept of a table. Tables are globally-visible data structures that typically have primary key columns and that can hold aggregation state. You can create tables using CREATE TABLE. An overview of named windows and tables, and a comparison between them, can be found at Section 6.1, “Overview”. The aforementioned ON SELECT/MERGE/UPDATE/INSERT/DELETE, INSERT INTO as well as joins and subqueries can be used with tables as well.

EPL allows execution of on-demand (fire-and-forget, non-continuous, triggered by API) queries against named windows and tables through the runtime API. The query engine automatically indexes named window data for fast access by ON SELECT/MERGE/UPDATE/INSERT/DELETE without the need to create an index explicitly, or can access explicit (secondary) table indexes for operations on tables. For fast on-demand query execution via runtime API use the CREATE INDEX syntax to create an explicit index for the named window or table in question.

Use CREATE SCHEMA to declare an event type.

Variables can come in handy to parameterize statements and change parameters on-the-fly and in response to events. Variables can be used in an expression anywhere in a statement as well as in the output clause for dynamic control of output rates.

Esper can be extended by plugging-in custom developed views and aggregation functions.

EPL queries are created and stored in the engine, and publish results to listeners as events are received by the engine or timer events occur that match the criteria specified in the query. Events can also be obtained from running EPL queries via the safeIterator and iterator methods that provide a pull-data API.

The select clause in an EPL query specifies the event properties or events to retrieve. The from clause in an EPL query specifies the event stream definitions and stream names to use. The where clause in an EPL query specifies search conditions that specify which event or event combination to search for. For example, the following statement returns the average price for IBM stock ticks in the last 30 seconds.

select avg(price) from StockTick.win:time(30 sec) where symbol='IBM'

EPL queries follow the syntax below. EPL queries can be simple queries or more complex queries. A simple select contains only a select clause and a single stream definition. Complex EPL queries can be built that feature a more elaborate select list utilizing expressions, may join multiple streams, may contain a where clause with search conditions and so on.

[annotations]
[expression_declarations]
[context context_name]
[into table table_name]
[insert into insert_into_def]
select select_list
from stream_def [as name] [, stream_def [as name]] [,...]
[where search_conditions]
[group by grouping_expression_list]
[having grouping_search_conditions]
[output output_specification]
[order by order_by_expression_list]
[limit num_rows]

Certain words such as select, delete or set are reserved and may not be used as identifiers. Please consult Appendix B, Reserved Keywords for the list of reserved keywords and permitted keywords.

Names of built-in functions and certain auxiliary keywords are permitted as event property names and in the rename syntax of the select clause. For example, count is acceptable.

Consider the example below, which assumes that 'last' is an event property of MyEvent:

// valid
select last, count(*) as count from MyEvent

This example shows an incorrect use of a reserved keyword:

// invalid
select insert from MyEvent

EPL offers an escape syntax for reserved keywords: Event properties as well as event or stream names may be escaped via the backwards apostrophe ` (ASCII 96) character.

The next example queries an event type by name Order (a reserved keyword) that provides a property by name insert (a reserved keyword):

// valid
select `insert` from `Order`

You may surround string values by either double-quotes (") or single-quotes ('). When your string constant in an EPL statement itself contains double quotes or single quotes, you must escape the quotes.

Double and single quotes may be escaped by the backslash (\) character or by unicode notation. Unicode 0027 is a single quote (') and 0022 is a double quote (").

Escaping event property names is described in Section 2.2.1, “Escape Characters”.

The sample EPL below escapes the single quote in the string constant John's, and filters out order events where the name value matches:

select * from OrderEvent(name='John\'s')
// ...equivalent to...
select * from OrderEvent(name='John\u0027s')

The next EPL escapes the string constant Quote "Hello":

select * from OrderEvent(description like "Quote \"Hello\"")
// is equivalent to
select * from OrderEvent(description like "Quote \u0022Hello\u0022")

When building an escaped string via the API, escape the backslash itself as well, as shown in the code snippet below:

epService.getEPAdministrator().createEPL("select * from OrderEvent(name='John\\'s')");
// ... and for double quotes...
epService.getEPAdministrator().createEPL("select * from OrderEvent("
  + "description like \"Quote \\\"Hello\\\"\")");
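The escaping rule above can also be applied programmatically when EPL text is assembled from application data. The following is a minimal sketch, assuming a hypothetical helper class EplStrings that is not part of the Esper API; it handles only single-quoted constants and only the single-quote character:

```java
// Hypothetical helper, not part of the Esper API: wraps a raw value in single
// quotes and backslash-escapes any single quote inside it.
final class EplStrings {
    private EplStrings() {
    }

    static String quote(String raw) {
        StringBuilder sb = new StringBuilder("'");
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\'') {
                sb.append('\\'); // a single backslash in the resulting EPL text
            }
            sb.append(c);
        }
        return sb.append('\'').toString();
    }

    public static void main(String[] args) {
        // Prints: select * from OrderEvent(name='John\'s')
        System.out.println("select * from OrderEvent(name=" + quote("John's") + ")");
    }
}
```

The resulting string could then be passed to createEPL in place of a hand-escaped Java literal.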

EPL honors all Java built-in primitive and boxed types, including java.math.BigInteger and java.math.BigDecimal.

EPL also follows Java standards in terms of widening, performing widening automatically in cases where widening type conversion is allowed without loss of precision, for both boxed and primitive types and including BigInteger and BigDecimal:

  1. byte to short, int, long, float, double, BigInteger or BigDecimal

  2. short to int, long, float, double, BigInteger or BigDecimal

  3. char to int, long, float, double, BigInteger or BigDecimal

  4. int to long, float, double, BigInteger or BigDecimal

  5. long to float, double, BigInteger or BigDecimal

  6. float to double or BigDecimal

  7. double to BigDecimal

In cases where loss of precision is possible because of narrowing requirements, EPL compilation outputs a compilation error.
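The widening rules listed above can be written down as a lookup table. The sketch below is illustrative only: the class WideningTable and its canWiden method are assumptions made for this example and say nothing about how the EPL compiler implements the check. Boxed types stand in for both the boxed and the primitive form.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical lookup table mirroring the seven widening rules listed above.
final class WideningTable {
    private static final Map<Class<?>, Set<Class<?>>> TARGETS = new HashMap<>();

    private static void allow(Class<?> from, Class<?>... to) {
        TARGETS.put(from, new HashSet<>(Arrays.asList(to)));
    }

    static {
        allow(Byte.class, Short.class, Integer.class, Long.class, Float.class,
                Double.class, BigInteger.class, BigDecimal.class);
        allow(Short.class, Integer.class, Long.class, Float.class,
                Double.class, BigInteger.class, BigDecimal.class);
        allow(Character.class, Integer.class, Long.class, Float.class,
                Double.class, BigInteger.class, BigDecimal.class);
        allow(Integer.class, Long.class, Float.class,
                Double.class, BigInteger.class, BigDecimal.class);
        allow(Long.class, Float.class, Double.class, BigInteger.class, BigDecimal.class);
        allow(Float.class, Double.class, BigDecimal.class);
        allow(Double.class, BigDecimal.class);
    }

    // True when a value of type 'from' may be widened to 'to' per the list above.
    static boolean canWiden(Class<?> from, Class<?> to) {
        if (from.equals(to)) {
            return true;
        }
        Set<Class<?>> targets = TARGETS.get(from);
        return targets != null && targets.contains(to);
    }

    public static void main(String[] args) {
        System.out.println(canWiden(Integer.class, BigDecimal.class)); // prints: true
        System.out.println(canWiden(Double.class, Float.class));       // prints: false (narrowing)
    }
}
```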

EPL supports casting via the cast function.

EPL returns double-type values for division regardless of operand type. EPL can also be configured to follow Java rules for integer arithmetic instead as described in Section 16.4.22, “Engine Settings related to Expression Evaluation”.

Division by zero returns positive or negative infinity. Division by zero can be configured to return null instead.
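For comparison with plain Java arithmetic, the sketch below mimics the default division behavior described above. The class EplDivision and its nullOnZero flag are hypothetical names chosen for this illustration; they model the documented results and are not the engine's implementation or its configuration API.

```java
// Hypothetical model of the documented division semantics.
final class EplDivision {
    private EplDivision() {
    }

    // Always divides as double; optionally returns null for a zero divisor.
    static Double divide(Number left, Number right, boolean nullOnZero) {
        double divisor = right.doubleValue();
        if (divisor == 0d && nullOnZero) {
            return null;
        }
        return left.doubleValue() / divisor;
    }

    public static void main(String[] args) {
        System.out.println(7 / 2);               // Java integer division prints: 3
        System.out.println(divide(7, 2, false)); // prints: 3.5
        System.out.println(divide(1, 0, false)); // prints: Infinity
        System.out.println(divide(-1, 0, false)); // prints: -Infinity
        System.out.println(divide(1, 0, true));  // prints: null
    }
}
```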

This chapter is about Java language constants and enum types and their use in EPL expressions.

Java language constants are public static final fields in Java that may participate in expressions of all kinds, as this example shows:

select * from MyEvent where property = MyConstantClass.FIELD_VALUE

Event properties that are enumeration values can be compared by their enum type value:

select * from MyEvent where enumProp = EnumClass.ENUM_VALUE_1

Event properties can also be passed to enum type functions or compared to an enum type method result:

select * from MyEvent where somevalue = EnumClass.ENUM_VALUE_1.getSomeValue()
  or EnumClass.ENUM_VALUE_2.analyze(someothervalue)

Enum types have a valueOf method that returns the enum type value:

select * from MyEvent where enumProp = EnumClass.valueOf('ENUM_VALUE_1')

If your application does not import, through configuration, the package that contains the enumeration class, then it must also specify the package name of the class. Enum types that are inner classes must be qualified with $ following Java conventions.

For example, the Color enum type as an inner class to MyEvent in package org.myorg can be referenced as shown:

select * from MyEvent(enumProp=org.myorg.MyEvent$Color.GREEN).std:firstevent()
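The $ separator is the binary name that Java itself assigns to nested types. The following sketch shows where the name comes from, assuming a simplified MyEvent class declared without a package so that the example stays self-contained; in package org.myorg the printed name would carry the org.myorg. prefix used in the EPL above.

```java
// Simplified stand-in for the MyEvent class of the example (no package declared here).
final class MyEvent {
    enum Color { RED, GREEN, BLUE }

    private final Color enumProp;

    MyEvent(Color enumProp) {
        this.enumProp = enumProp;
    }

    public Color getEnumProp() {
        return enumProp;
    }

    public static void main(String[] args) {
        // Binary name of the nested enum: outer class name, '$', nested class name.
        System.out.println(Color.class.getName());
        // valueOf resolves an enum value from its name.
        System.out.println(Color.valueOf("GREEN") == Color.GREEN); // prints: true
    }
}
```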

Instance methods may also be invoked on event instances by specifying a stream name, as shown below:

select myevent.computeSomething() as result from MyEvent as myevent

Chaining instance methods is supported as this example shows:

select myevent.getComputerFor('books', 'movies').calculate() as result 
from MyEvent as myevent

An annotation is an addition made to information in a statement. Esper provides certain built-in annotations for defining statement name, adding a statement description or for tagging statements such as for managing statements or directing statement output. Other than the built-in annotations, applications can provide their own annotation classes that the EPL compiler can populate.

An annotation is part of the statement text and precedes the EPL select or pattern statement. Annotations are therefore part of the EPL grammar. The syntax for annotations follows the host language (Java, .NET) annotation syntax:

@annotation_name [(annotation_parameters)]

An annotation consists of the annotation name and optional annotation parameters. The annotation_name is the simple class name or fully-qualified class name of the annotation class. The optional annotation_parameters are a list of key-value pairs following the syntax:

@annotation_name (attribute_name = attribute_value, [name=value, ...])

The attribute_name is an identifier that must match the attributes defined by the annotation class. An attribute_value is a constant of any of the primitive types or string, an array, an enum type value or another (nested) annotation. Null values are not allowed as annotation attribute values. Enumeration values are supported in EPL statements and not supported in statements created via the createPattern method.

Use the getAnnotations method of EPStatement to obtain annotations provided via statement text.

The names of built-in annotations are not case-sensitive, allowing both @NAME and @name, for example.

The list of built-in EPL statement-level annotations is:

Table 5.2. Built-In EPL Statement Annotations

Name / Purpose and Attributes / Example

Name
  Provides a statement name. Attributes are:
    value : Statement name.
  Example: @Name("MyStatementName")

Description
  Provides a statement textual description. Attributes are:
    value : Statement description.
  Example: @Description("Place statement description here.")

Tag
  For tagging a statement with additional information. Attributes are:
    name : Tag name.
    value : Tag value.
  Example: @Tag(name="MyTagName", value="MyTagValue")

Priority
  Applicable when an event (or schedule) matches filter criteria for multiple statements: Defines the order of statement processing (requires an engine-level setting). Attributes are:
    value : priority value.
  Example: @Priority(10)

Drop
  Applicable when an event (or schedule) matches filter criteria for multiple statements, drops the event after processing the statement (requires an engine-level setting). No attributes.
  Example: @Drop

Hint
  For providing one or more hints towards how the engine should execute a statement. Attributes are:
    value : A comma-separated list of one or more case-insensitive keywords.
  Example: @Hint('iterate_only')

Hook
  Use this annotation to register one or more statement-specific hooks providing a hook type for each individual hook, such as for SQL parameter, column or row conversion. Attributes are the hook type and the hook itself (usually an import or class name).
  Example: @Hook(type=HookType.SQLCOL, hook='MyDBTypeConvertor')

Audit
  Causes the engine to output detailed processing information for a statement. Attributes are:
    optional value : A comma-separated list of one or more case-insensitive keywords.
  Example: @Audit

EventRepresentation
  Causes the engine to use object-array event representation, if possible, for output and internal event types.
  Example: @EventRepresentation(array=true)

IterableUnbound
  For use when iterating statements with unbound streams, instructs the engine to retain the last event for iterating.
  Example: @IterableUnbound

The following example statement text specifies some of the built-in annotations in combination:

@Name("RevenuePerCustomer")
@Description("Outputs revenue per customer considering all events encountered so far.")
@Tag(name="grouping", value="customer")

select customerId, sum(revenue) from CustomerRevenueEvent

This annotation only takes effect if the engine-level setting for prioritized execution is set via configuration, as described in Section 16.4.23, “Engine Settings related to Execution of Statements”.

Use the @Priority EPL annotation to tag statements with a priority value. The default priority value is zero (0) for all statements. When an event (or single timer execution) requires processing the event for multiple statements, processing begins with the highest priority statement and ends with the lowest-priority statement.

Example:

@Priority(10) select * from SecurityFilter(ip="127.0.0.1")

This annotation only takes effect if the engine-level setting for prioritized execution is set via configuration, as described in Section 16.4.23, “Engine Settings related to Execution of Statements”.

Use the @Drop EPL annotation to tag statements that preempt all other same or lower-priority statements. When an event (or single timer execution) requires processing the event for multiple statements, processing begins with the highest priority statement and ends with the first statement marked with @Drop, which becomes the last statement to process that event.

Unless a different priority is specified, the statement with the @Drop EPL annotation executes at priority 1. Thereby @Drop alone is an effective means to remove events from a stream.

Example:

@Drop select * from SecurityFilter(ip="127.0.0.1")
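The interplay of @Priority and @Drop described above can be modeled in a few lines. The sketch below is a simplified model under stated assumptions, not the engine's scheduler: the class PriorityDropSketch and its Stmt type are hypothetical, a plain statement gets priority 0, a statement with @Drop alone gets priority 1, and among statements of equal priority the one marked @Drop is placed first so that it preempts the others.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the processing order for one event that matches several statements.
final class PriorityDropSketch {
    static final class Stmt {
        final String name;
        final int priority;
        final boolean drop;

        private Stmt(String name, int priority, boolean drop) {
            this.name = name;
            this.priority = priority;
            this.drop = drop;
        }

        static Stmt plain(String name) {
            return new Stmt(name, 0, false); // default priority is zero
        }

        static Stmt withPriority(String name, int priority) {
            return new Stmt(name, priority, false);
        }

        static Stmt drop(String name) {
            return new Stmt(name, 1, true); // @Drop alone executes at priority 1
        }
    }

    // Returns the names of the statements that get to process the event, in order.
    static List<String> processingOrder(List<Stmt> matching) {
        List<Stmt> sorted = new ArrayList<>(matching);
        sorted.sort((a, b) -> {
            if (a.priority != b.priority) {
                return Integer.compare(b.priority, a.priority); // highest priority first
            }
            return Boolean.compare(b.drop, a.drop); // @Drop first among equal priorities
        });
        List<String> processed = new ArrayList<>();
        for (Stmt s : sorted) {
            processed.add(s.name);
            if (s.drop) {
                break; // the @Drop statement is the last one to see the event
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<Stmt> matching = Arrays.asList(
                Stmt.plain("audit"), Stmt.drop("filter"), Stmt.withPriority("urgent", 10));
        System.out.println(processingOrder(matching)); // prints: [urgent, filter]
    }
}
```

In this model the statement named audit never sees the event, which is the effect of removing events from a stream that the text describes.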

An expression alias simply assigns a name to an expression. The alias name can be used in other expressions to refer to that expression, without the need to duplicate the expression.

The expression alias obtains its scope from where it is used. Parameters cannot be provided. A second means of sharing expressions is the expression declaration as described next, which allows passing parameters and is more tightly scoped.

An EPL statement can contain and refer to any number of expression aliases. For expression aliases that are visible across multiple EPL statements please consult Section 5.18.1, “Global Expression Aliases” that explains the create expression clause.

The syntax for an expression alias is:

expression expression_name alias for { expression }

An expression alias consists of the expression name and an expression in curly braces. The return type of the expression is determined by the engine and need not be specified. The scope is automatic and determined by where the alias name is used; therefore parameters cannot be specified.

This example declares an expression alias twoPI that substitutes Math.PI * 2:

expression twoPI alias for { Math.PI * 2 }
select twoPI from SampleEvent

The next example specifies an alias countPeople and uses the alias in the select-clause and the having-clause:

expression countPeople alias for { count(*) }
select countPeople from EnterRoomEvent.win:time(10 seconds) having countPeople > 10

When using the expression alias in an expression, empty parentheses can optionally be specified. In the above example, countPeople() can be used instead and equivalently.

The following scope rules apply for expression aliases:

  1. Expression aliases do not remove implicit limitations: For example, aggregation functions cannot be used in a filter expression even if assigned an alias.

An EPL statement can contain expression declarations. Expressions that are common to multiple places in the same EPL statement can be moved to a named expression declaration and reused within the same statement without duplicating the expression itself.

For declaring expressions that are visible across multiple EPL statements i.e. globally visible expressions please consult Section 5.18.2, “Global Expression Declarations” that explains the create expression clause.

An expression declaration follows the lambda-style expression syntax. This syntax was chosen as it typically allows for a shorter and more concise expression body that can be easier to read than most procedural code.

The syntax for an expression declaration is:

expression expression_name { expression_body }

An expression declaration consists of the expression name and an expression body. The expression_name is any identifier. The expression_body contains optional parameters and the expression. The parameter types and the return type of the expression are determined by the engine and do not need to be specified.

Parameters to a declared expression can be a stream name, pattern tag name or wildcard (*). Use wildcard to pass the event itself to the expression. In a join or subquery, or more generally in an expression where multiple streams or pattern tags are available, the EPL must specify the stream name or pattern tag name and cannot use wildcard.

In the expression body the => lambda operator reads as "goes to" (-> may be used and is equivalent). The left side of the lambda operator specifies the input parameters (if any) and the right side holds the expression. The lambda expression x => x * x is read "x goes to x times x".

In the expression body, if your expression takes no parameters, you may simply specify the expression and do not need the => lambda operator.

If your expression takes one parameter, specify the input parameter name followed by the => lambda operator and followed by the expression. The synopsis for use with a single input parameter is:

expression_body:   input_param_name => expression 

If your expression takes two or more parameters, specify the input parameter names in parentheses followed by the => lambda operator followed by the expression. The synopsis for use with multiple input parameters is:

expression_body:   (input_param [,input_param [,...]]) => expression 

The following example declares an expression that returns two times PI (ratio of the circumference of a circle to its diameter) and demonstrates its use in a select-clause:

expression twoPI { Math.PI * 2} select twoPI() from SampleEvent

The parentheses are optional when the expression accepts no parameters. The below is equivalent to the previous example:

expression twoPI { Math.PI * 2} select twoPI from SampleEvent

The next example declares an expression that accepts one parameter: a MarketData event. The expression computes a new "mid" price based on the buy and sell price:

expression midPrice { x => (x.buy + x.sell) / 2 } 
select midPrice(md) from MarketDataEvent as md

The variable name can be left off if event property names resolve without ambiguity.

This example EPL removes the variable name x:

expression midPrice { x => (buy + sell) / 2 } 
select midPrice(md) from MarketDataEvent as md

The next example EPL specifies wildcard instead:

expression midPrice { x => (buy + sell) / 2 } 
select midPrice(*) from MarketDataEvent

A further example that demonstrates two parameters is listed next. The example joins two streams and uses the price value from MarketDataEvent and the sentiment value of NewsEvent to compute a weighted sentiment:

expression weightedSentiment { (x, y) => x.price * y.sentiment } 
select weightedSentiment(md, news) 
from MarketDataEvent.std:lastevent() as md, NewsEvent.std:lastevent() news
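Readers familiar with Java lambdas may find it helpful to compare the three forms of declared expressions shown so far (no parameter, one parameter, two parameters) with their rough Java counterparts. The sketch below is only an analogy: the MarketData and News classes are hypothetical stand-ins for the event types, and a declared EPL expression is resolved by the engine rather than compiled to these functional interfaces.

```java
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

// Java analogy for the twoPI, midPrice and weightedSentiment declarations above.
final class DeclaredExpressionAnalogy {
    // Hypothetical stand-in for MarketDataEvent.
    static final class MarketData {
        final double buy;
        final double sell;
        final double price;

        MarketData(double buy, double sell, double price) {
            this.buy = buy;
            this.sell = sell;
            this.price = price;
        }
    }

    // Hypothetical stand-in for NewsEvent.
    static final class News {
        final double sentiment;

        News(double sentiment) {
            this.sentiment = sentiment;
        }
    }

    // expression twoPI { Math.PI * 2 }
    static final Supplier<Double> TWO_PI = () -> Math.PI * 2;

    // expression midPrice { x => (x.buy + x.sell) / 2 }
    static final Function<MarketData, Double> MID_PRICE = x -> (x.buy + x.sell) / 2;

    // expression weightedSentiment { (x, y) => x.price * y.sentiment }
    static final BiFunction<MarketData, News, Double> WEIGHTED_SENTIMENT =
            (x, y) -> x.price * y.sentiment;

    public static void main(String[] args) {
        MarketData md = new MarketData(10, 12, 11);
        News news = new News(0.5);
        System.out.println(MID_PRICE.apply(md));                // prints: 11.0
        System.out.println(WEIGHTED_SENTIMENT.apply(md, news)); // prints: 5.5
    }
}
```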

Any expression can be used in the expression body including aggregations, variables, subqueries or further declared or alias expressions. Sub-queries, when used without in or exists, must be placed within parentheses.

An example subquery within an expression declaration is shown next:

expression newsSubq { md -> 
    (select sentiment from NewsEvent.std:unique(symbol) where symbol = md.symbol) 
} 
select newsSubq(mdstream)
from MarketDataEvent mdstream

When using expression declarations please note these limitations:

  1. Parameters to a declared expression can only be a stream name, pattern tag name or wildcard (*).

  2. Expression declarations do not remove implicit limitations: For example, aggregation functions cannot be used in a filter expression even if using an expression declaration.

The following scope rules apply for declared expressions:

  1. The scope of the expression body of a declared expression only includes the parameters explicitly listed. Consider using an expression alias instead.

You may refer to a context in the EPL text by specifying the context keyword followed by a context name. Contexts are described in more detail in Chapter 4, Context and Context Partitions.

The effect of referring to a context is that your statement operates according to the context dimensional information as declared for the context.

The synopsis is:

... context context_name ...

You may refer to a context in all statements except for the following types of statements:

  1. create schema for declaring event types.

  2. create variable for declaring a variable.

  3. create index for creating an index on a named window or table.

  4. update istream for updating insert stream events.

The select clause is required in all EPL statements. The select clause can be used to select all properties via the wildcard *, or to specify a list of event properties and expressions. The select clause defines the event type (event property names and types) of the resulting events published by the statement, or pulled from the statement via the iterator methods.

The select clause also offers optional istream, irstream and rstream keywords to control whether insert stream, insert and remove stream, or remove stream events are posted to UpdateListener instances and observers to a statement. By default, the engine provides only the insert stream to listeners and observers. See Section 16.4.17, “Engine Settings related to Stream Selection” on how to change the default.

The syntax for the select clause is summarized below.

select [istream | irstream | rstream] [distinct] * | expression_list ... 

The istream keyword is the default, and indicates that the engine only delivers insert stream events to listeners and observers. The irstream keyword indicates that the engine delivers both insert and remove stream. Finally, the rstream keyword tells the engine to deliver only the remove stream.

The distinct keyword outputs only unique rows depending on the column list you have specified after it. It must occur after the select and after the optional stream keywords, as described in more detail below.

The syntax for selecting all event properties in a stream is:

select * from stream_def

The following statement selects StockTick events for the last 30 seconds of IBM stock ticks.

select * from StockTick(symbol='IBM').win:time(30 sec)

You may well be asking: Why does the statement specify a time window here? First, the statement is meant to demonstrate the use of * wildcard. When the engine pushes statement results to your listener and as the statement does not select remove stream events via rstream keyword, the listener receives only new events and the time window could be left off. By adding the time window the pull API (iterator API or JDBC driver) returns the last 30 seconds of events.

The * wildcard and expressions can also be combined in a select clause. The combination selects all event properties and in addition the computed values as specified by any additional expressions that are part of the select clause. Here is an example that selects all properties of stock tick events plus a computed product of price and volume that the statement names 'pricevolume':

select *, price * volume as pricevolume from StockTick

When using wildcard (*), Esper does not actually read or copy your event properties out of your event or events, neither does it copy the event object. It simply wraps your native type in an EventBean interface. Your application has access to the underlying event object through the getUnderlying method and has access to the property values through the get method.

In a join statement, using the select * syntax selects one event property per stream to hold the event for that stream. The property name is the stream name in the from clause.

If your statement is joining multiple streams, you may specify property names that are unique among the joined streams, or use wildcard (*) as explained earlier.

In case the property name in your select or other clauses is not unique considering all joined streams, you will need to use the name of the stream as a prefix to the property.

This example is a join between the two streams StockTick and News, respectively named as 'tick' and 'news'. The example selects from the StockTick event the symbol value using the 'tick' stream name as a prefix:

select tick.symbol from StockTick.win:time(10) as tick, News.win:time(10) as news
where news.symbol = tick.symbol

Use the wildcard (*) selector in a join to generate a property for each stream, with the property value being the event itself. The output events of the statement below have two properties: the 'tick' property holds the StockTick event and the 'news' property holds the News event:

select * from StockTick.win:time(10) as tick, News.win:time(10) as news

The following syntax can also be used to specify what stream's properties to select:

select stream_name.* [as name] from ...

The selection of tick.* selects the StockTick stream events only:

select tick.* from StockTick.win:time(10) as tick, News.win:time(10) as news
where tick.symbol = news.symbol

The next example uses the as keyword to name each stream's joined events. This instructs the engine to create a property for each named event:

select tick.* as stocktick, news.* as news 
from StockTick.win:time(10) as tick, News.win:time(10) as news
where tick.symbol = news.symbol

The output events of the above example have two properties 'stocktick' and 'news' that are the StockTick and News events.

The stream name itself, as further described in Section 5.4.5, “Using the Stream Name”, may be used within expressions or alone.

This example passes events to a user-defined function named compute and also shows insert-into to populate an event stream of combined events:

insert into TickNewStream select tick, news, MyLib.compute(news, tick) as result
from StockTick.win:time(10) as tick, News.win:time(10) as news
where tick.symbol = news.symbol
// second statement that uses the TickNewStream stream
select tick.price, news.text, result from TickNewStream

In summary, the stream_name.* stream name wildcard syntax can be used to select a stream as the underlying event or as a property, but cannot appear within an expression. The stream_name syntax (without wildcard), by contrast, always selects a property (and not the underlying event) and can occur anywhere within an expression.

If your statement employs pattern expressions, then your pattern expression tags events with a tag name. Each tag name becomes available for use as a property in the select clause and all other clauses.

For example, here is a very simple pattern that matches on every StockTick event received within 10 seconds after start of the statement. The sample selects the symbol and price properties of the matching events:

select tick.symbol as symbol, tick.price as price
from pattern[every tick=StockTick where timer:within(10 sec)]

The use of the wildcard selector, as shown in the next statement, creates a property for each tagged event in the output. The next statement outputs events that hold a single 'tick' property whose value is the event itself:

select * from pattern[every tick=StockTick where timer:within(10 sec)]

You may also select the matching event itself using the tick.* syntax. The engine outputs the StockTick event itself to listeners:

select tick.* from pattern[every tick=StockTick where timer:within(10 sec)]

A tag name as specified in a pattern is a valid expression itself. This example uses the insert into clause to make available the events matched by a pattern to further statements:

// make a new stream of ticks and news available
insert into StockTickAndNews 
select tick, news from pattern [every tick=StockTick -> news=News(symbol=tick.symbol)]
// second statement to select from the stream of ticks and news
select tick.symbol, tick.price, news.text from StockTickAndNews

The optional istream, irstream and rstream keywords in the select clause control the event streams posted to listeners and observers to a statement.

If none of these keywords is specified, and in the default engine configuration, the engine posts only insert stream events via the newEvents parameter to the update method of UpdateListener instances listening to the statement. The engine does not post remove stream events, by default.

The insert stream consists of the events entering the respective window(s) or stream(s) or aggregations, while the remove stream consists of the events leaving the respective window(s) or the changed aggregation result. See Chapter 3, Processing Model for more information on insert and remove streams.

The engine posts remove stream events to the oldEvents parameter of the update method only if the irstream keyword occurs in the select clause. This behavior can be changed via engine-wide configuration as described in Section 16.4.17, “Engine Settings related to Stream Selection”.

By specifying the istream keyword you can instruct the engine to only post insert stream events via the newEvents parameter to the update method on listeners. The engine will then not post any remove stream events, and the oldEvents parameter is always a null value.

By specifying the irstream keyword you can instruct the engine to post both insert stream and remove stream events.

By specifying the rstream keyword you can instruct the engine to only post remove stream events via the newEvents parameter to the update method on listeners. The engine will then not post any insert stream events, and the oldEvents parameter is also always a null value.

The following statement selects only the events that are leaving the 30 second time window.

select rstream * from StockTick.win:time(30 sec)
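To make the distinction between insert stream and remove stream concrete, the following sketch simulates a time window in plain Java. It is a simplified model, not Esper code: the class TimeWindowSketch, its method names and the exact expiry boundary (an event leaves once its age reaches the window length) are assumptions made for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a time window that reports which events enter and leave.
final class TimeWindowSketch {
    private static final class Entry {
        final long time;
        final String event;

        Entry(long time, String event) {
            this.time = time;
            this.event = event;
        }
    }

    private final long windowMillis;
    private final Deque<Entry> window = new ArrayDeque<>();

    TimeWindowSketch(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // An arriving event enters the window; it forms the insert stream.
    String insert(long now, String event) {
        window.addLast(new Entry(now, event));
        return event;
    }

    // Advancing time expires old events; the expired events form the remove stream.
    List<String> advanceTo(long now) {
        List<String> removed = new ArrayList<>();
        while (!window.isEmpty() && window.peekFirst().time + windowMillis <= now) {
            removed.add(window.removeFirst().event);
        }
        return removed;
    }

    public static void main(String[] args) {
        TimeWindowSketch win = new TimeWindowSketch(30_000);
        win.insert(0, "tick-1");
        win.insert(10_000, "tick-2");
        System.out.println(win.advanceTo(29_999)); // prints: []
        System.out.println(win.advanceTo(30_000)); // prints: [tick-1]
        System.out.println(win.advanceTo(40_000)); // prints: [tick-2]
    }
}
```

A statement selecting rstream corresponds to a listener that only ever receives the lists returned by advanceTo in this model.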

The istream and rstream keywords in the select clause are matched by same-name keywords available in the insert into clause. While the keywords in the select clause control the event stream posted to listeners to the statement, the same keywords in the insert into clause specify the event stream that the engine makes available to other statements.

The optional distinct keyword removes duplicate output events from output. The keyword must occur after the select keyword and after the optional irstream keyword.

The distinct keyword in your select instructs the engine to consolidate, at time of output, the output event(s) and remove output events with identical property values. Duplicate removal only takes place when two or more events are output together at any one time, therefore distinct is typically used with a batch data window, output rate limiting, on-demand queries, on-select or iterator pull API.

If two or more output event objects have same property values for all properties of the event, the distinct removes all but one duplicated event before outputting events to listeners. Indexed, nested and mapped properties are considered in the comparison, if present in the output event.

The next example outputs sensor ids of temperature sensor events, but only every 10 seconds and only unique sensor id values during the 10 seconds:

select distinct sensorId from TemperatureSensorEvent output every 10 seconds

Use distinct with wildcard (*) to remove duplicate output events considering all properties of an event.

This example statement outputs all distinct events either when 100 events arrive or when 10 seconds passed, whichever occurs first:

select distinct * from TemperatureSensorEvent.win:time_length_batch(10, 100)

When selecting nested, indexed, mapped or dynamic properties in a select clause with distinct, it is relevant to know that the comparison uses hash code and the Java equals semantics.
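The effect of hash code and equals semantics on duplicate removal can be illustrated with plain Java collections. The sketch below is an illustration only: the class DistinctSketch is hypothetical, output rows are modeled as maps of property name to value, and duplicates are removed within one batch of rows that are output together, keeping the first occurrence.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of distinct applied to one batch of output rows.
final class DistinctSketch {
    private DistinctSketch() {
    }

    // Builds a single-property output row.
    static Map<String, Object> row(String property, Object value) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put(property, value);
        return row;
    }

    // Rows count as duplicates when Map.equals and Map.hashCode agree on all properties.
    static List<Map<String, Object>> distinct(List<Map<String, Object>> batch) {
        Set<Map<String, Object>> unique = new LinkedHashSet<>(batch);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> batch = new ArrayList<>();
        batch.add(row("sensorId", "S1"));
        batch.add(row("sensorId", "S1"));
        batch.add(row("sensorId", "S2"));
        System.out.println(distinct(batch)); // prints: [{sensorId=S1}, {sensorId=S2}]
    }
}
```

Property values whose classes do not override equals and hashCode would compare by identity in this model, which is why the comparison semantics matter for nested, indexed, mapped and dynamic properties.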

By default, for certain select-clause expressions that output events or a collection of events, the engine outputs the underlying event objects. Output here means the data passed to listeners and subscribers, and the data inserted into another stream via insert-into.

The select-clause expressions for which underlying event objects are output by default are:

To have the engine output EventBean instance(s) instead, add @eventbean to the relevant expressions of the select-clause.

The sample EPL shown below outputs current data window contents as EventBean instances into the stream OutStream, so that statements consuming the stream may operate on such instances:

insert into OutStream 
select prevwindow(s0) @eventbean as win 
from MyEvent.win:length(2) as s0

The next EPL consumes the stream and selects the last event:

select win.lastOf() from OutStream

It is not necessary to use @eventbean if an event type by the same name (OutStream in the example) is already declared, a property exists on the type by the same name (win in this example) and the type of the property is the event type (MyEvent in the example) returned by the expression. This is further described in Section 5.10.8, “Select-Clause Expression And Inserted-Into Column Event Type”.

The from clause is required in all EPL statements. It specifies one or more event streams, named windows or tables. Each event stream, named window or table can optionally be given a name by means of the as keyword.

from stream_def [as name] [unidirectional] [retain-union | retain-intersection] 
    [, stream_def [as stream_name]] [, ...]

The event stream definition stream_def as shown in the syntax above consists of either a filter-based event stream definition or a pattern-based event stream definition.

For joins and outer joins, specify two or more event streams. Joins between pattern-based and filter-based event streams are also supported. Joins and the unidirectional keyword are described in more detail in Section 5.12, “Joining Event Streams”.

Esper supports joins against relational databases for access to historical or reference data as explained in Section 5.13, “Accessing Relational Data via SQL”. Esper can also join results returned by an arbitrary method invocation, as discussed in Section 5.14, “Accessing Non-Relational Data via Method Invocation”.

The stream_name is an optional identifier assigned to the stream. The stream name can itself occur in any expression and provides access to the event itself from the named stream. Also, a stream name may be combined with a method name to invoke instance methods on events of that stream.

For all streams with the exception of historical sources your query may employ data window views as outlined below. The retain-intersection (the default) and retain-union keywords build an intersection or a union, respectively, of two or more data windows as described in Section 5.4.4, “Multiple Data Window Views”.

The stream_def syntax for a filter-based event stream is as below:

event_stream_name [(filter_criteria)] [contained_selection] [.view_spec] [.view_spec] [...]

The event_stream_name is either the name of an event type or name of an event stream populated by an insert into statement or the name of a named window or table.

The filter_criteria is optional and consists of a list of expressions filtering the events of the event stream, within parentheses after the event stream name. Filter criteria cannot be specified for tables.

The contained_selection is optional and is for use with coarse-grained events that have properties that are themselves one or more events, see Section 5.19, “Contained-Event Selection” for the synopsis and examples. Contained-event cannot be specified for tables.

The view_spec are optional view specifications, which are combinable definitions for retaining events and for deriving information from events. Views cannot be specified for tables.

The following EPL statement shows event type, filter criteria and views combined in one statement. It selects all event properties for the last 100 events of IBM stock ticks for volume. In the example, the event type is the fully qualified Java class name org.esper.example.StockTick. The expression filters for events where the property symbol has a value of "IBM". The optional view specifications for deriving data from the StockTick events are a length window and a view for computing statistics on volume. The name for the event stream is "volumeStats".

select * from 
  org.esper.example.StockTick(symbol='IBM').win:length(100).stat:uni(volume) as volumeStats

Esper filters out events in an event stream as defined by filter criteria before it sends events to subsequent views. Thus, compared to search conditions in a where clause, filter criteria remove unneeded events early. In the above example, events with a symbol other than IBM do not enter the length window.

The filter criteria that select events with certain event property values are placed within parentheses after the event type name:

select * from RfidEvent(category="Perishable")

All expressions can be used in filters, including static methods that return a boolean value:

select * from com.mycompany.RfidEvent(MyRFIDLib.isInRange(x, y) or (x < 0 and y < 0))

Filter expressions can be separated via a single comma ','. The comma represents a logical AND between filter expressions:

select * from RfidEvent(zone=1, category=10)
...is equivalent to...
select * from RfidEvent(zone=1 and category=10)

The following operators are highly optimized through indexing and are the preferred means of filtering in high-volume event streams and especially in the presence of a larger number of filters or statements:

At compile time as well as at run time, the engine scans new filter expressions for sub-expressions that can be indexed. Indexing filter values to match event properties of incoming events enables the engine to match incoming events faster, especially if your application creates a large number of statements or requires many similar filters. The above list of operators represents the set of operators that the engine can best convert into indexes. The use of comma or logical and in filter expressions does not impact optimizations by the engine.

Event pattern expressions can also be used to specify one or more event streams in an EPL statement. For pattern-based event streams, the event stream definition stream_def consists of the keyword pattern and a pattern expression in brackets []. The syntax for an event stream definition using a pattern expression is below. As in filter-based event streams, an optional list of views that derive data from the stream can be supplied.

pattern [pattern_expression] [.view_spec] [.view_spec] [...]

The next statement specifies an event stream that consists of both stock tick events and trade events. The example tags stock tick events with the name "tick" and trade events with the name "trade".

select * from pattern [every tick=StockTickEvent or every trade=TradeEvent]

This statement generates an event every time the engine receives either one of the event types. The generated events resemble a map with "tick" and "trade" keys. For stock tick events, the "tick" key value is the underlying stock tick event, and the "trade" key value is a null value. For trade events, the "trade" key value is the underlying trade event, and the "tick" key value is a null value.

Let's further refine this statement by adding a view that gives us the last 30 seconds of either stock tick or trade events. Let's also select prices and a price total.

select tick.price as tickPrice, trade.price as tradePrice, 
       sum(tick.price) + sum(trade.price) as total
  from pattern [every tick=StockTickEvent or every trade=TradeEvent].win:time(30 sec)

Note that in the statement above tickPrice and tradePrice can each be null values depending on the event processed. Therefore, an aggregation function such as sum(tick.price + trade.price) would always return null, because one of the two price properties is always null for any event matching the pattern. Use the coalesce function to handle null values, for example: sum(coalesce(tick.price, 0) + coalesce(trade.price, 0)).
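
The effect of null values on the total can be sketched in plain Python, with None standing in for null. This is a model of the behavior described above, not Esper code; the helper names null_safe_add, coalesce and sum_ignoring_nulls and the sample prices are assumptions made for illustration:

```python
def null_safe_add(a, b):
    """SQL-style addition: the result is None (null) if either operand is None."""
    if a is None or b is None:
        return None
    return a + b


def coalesce(*values):
    """Return the first value that is not None, or None if all are None."""
    for value in values:
        if value is not None:
            return value
    return None


def sum_ignoring_nulls(values):
    """Sum the non-None values; return None when there are none."""
    present = [v for v in values if v is not None]
    return sum(present) if present else None


# Each pattern match carries either a tick price or a trade price, never both.
matches = [
    {"tick": 10.0, "trade": None},
    {"tick": None, "trade": 20.0},
    {"tick": 5.0, "trade": None},
]

# Every addition yields None, so the aggregate has nothing to sum.
naive_total = sum_ignoring_nulls(
    null_safe_add(m["tick"], m["trade"]) for m in matches
)
# Replacing None by 0 before adding gives a usable total.
coalesced_total = sum_ignoring_nulls(
    coalesce(m["tick"], 0) + coalesce(m["trade"], 0) for m in matches
)
```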

Views are used to specify an expiry policy for events (data window views) and also to derive data. Views can be staggered onto each other. See Chapter 13, EPL Reference: Views for the views available; that chapter also outlines the different types of views: Data Window views and Derived-Value views.

Views can optionally take one or more parameters. These parameters are expressions themselves that may consist of any combination of variables, arithmetic, user-defined function or substitution parameters for prepared statements, for example.

The example statement below outputs a count per expressway for car location events (which contain information about the location of a car on a highway) of the last 60 seconds:

select expressway, count(*) from CarLocEvent.win:time(60) 
group by expressway

The next example serves to show staggering of views. It uses the std:groupwin view to create a separate length window per car id:

select carId, expressway, direction, segment, count(*) 
from CarLocEvent.std:groupwin(carId).win:length(4) 
group by carId, expressway, direction, segment

The first view std:groupwin(carId) groups car location events by car id. The second view win:length(4) keeps a length window of the 4 last events, with one separate length window for each car id. The example reports the number of events per car id and per expressway, direction and segment considering the last 4 events for each car id only.
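
As a rough mental model of staggering std:groupwin(carId) with win:length(4), the following plain-Python sketch keeps one bounded queue per car id. It is not Esper code: the class GroupedLengthWindow, its methods and the sample events are hypothetical names chosen for illustration.

```python
from collections import defaultdict, deque


class GroupedLengthWindow:
    """Keep the last `size` events separately for each key value."""

    def __init__(self, key, size):
        self.key = key
        # One bounded deque per key; the oldest event drops out automatically.
        self.windows = defaultdict(lambda: deque(maxlen=size))

    def add(self, event):
        self.windows[event[self.key]].append(event)

    def count(self, **props):
        """Count retained events whose properties match all of `props`."""
        return sum(
            1
            for window in self.windows.values()
            for event in window
            if all(event[name] == value for name, value in props.items())
        )


window = GroupedLengthWindow(key="carId", size=4)
for segment in range(5):
    # Five events for car 1: only the last four are retained.
    window.add({"carId": 1, "expressway": "I-5", "segment": segment})
window.add({"carId": 2, "expressway": "I-5", "segment": 0})
```

Car 2 keeps its own window, so its single event is unaffected by the events arriving for car 1.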

Note that the group by syntax is generally preferable over std:groupwin for grouping information as it is SQL-compliant, easier to read and does not create a separate data window per group. The std:groupwin in the above example creates a separate data window (length window in the example) per group, demonstrating staggering views.

When views are staggered onto each other as a chain of views, then the insert and remove stream received by each view is the insert and remove stream made available by the view (or stream) earlier in the chain.

The special keep-all view keeps all events: It does not provide a remove stream, i.e. events are not removed from the keep-all view unless by means of the on-delete syntax or by revision events.

Data window views provide an expiry policy that indicates when to remove events from the data window, with the exception of the keep-all data window which has no expiry policy and the std:groupwin grouped-window view for allocating a new data window per group.

EPL allows the freedom to use multiple data window views onto a stream and thus combine expiry policies. Combining data windows into an intersection (the default) or a union can achieve a useful strategy for retaining events and expiring events that are no longer of interest. Named windows, tables and the on-delete syntax provide an additional degree of freedom.

In order to combine two or more data window views there is no keyword required. The retain-intersection keyword is the default and the retain-union keyword may instead be provided for a stream.

The concepts of union and intersection come from set theory. Two sets A and B can be combined in two ways: The intersection of A and B is the set of all things which are members of both A and B, i.e. the members the two sets have "in common". The union of A and B is the set of all things which are members of either A or B.

Use the retain-intersection (the default) keyword to retain an intersection of all events as defined by two or more data windows. All events removed from any of the intersected data windows are entered into the remove stream. This is the default behavior if neither retain keyword is specified.

Use the retain-union keyword to retain a union of all events as defined by two or more data windows. Only events removed from all data windows are entered into the remove stream.

The next example statement totals the price of OrderEvent events in a union of the last 30 seconds and unique by product name:

select sum(price) from OrderEvent.win:time(30 sec).std:unique(productName) retain-union

In the above statement, all OrderEvent events that are either less than 30 seconds old or that are the last event for the product name are considered.

Here is an example statement that totals the price of OrderEvent events in an intersection of the last 30 seconds and unique by product name:

select sum(price) from OrderEvent.win:time(30 sec).std:unique(productName) retain-intersection

In the above statement, only those OrderEvent events that are both less than 30 seconds old and are the last event for the product name are considered. The number of events that the engine retains is the number of unique events per product name in the last 30 seconds (and not the number of events in the last 30 seconds).

For an intersection the engine retains the minimal number of events representing that intersection. Thus when combining a time window of 30 seconds and a last-event window, for example, the number of events retained at any time is zero or one event (and not 30 seconds of events).
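
The difference between retain-union and retain-intersection can be sketched with Python sets. This is a snapshot model only, not how the engine maintains its windows incrementally; the helper names time_window, unique_window and total_price, the id and time fields, and the sample orders are assumptions made for illustration:

```python
def time_window(events, now, seconds):
    """Ids of events that are less than `seconds` old at time `now`."""
    return {e["id"] for e in events if now - e["time"] < seconds}


def unique_window(events, key):
    """Ids of the most recent event per distinct value of `key`."""
    last = {}
    for e in events:  # events are assumed to be in arrival order
        last[e[key]] = e["id"]
    return set(last.values())


def total_price(events, ids):
    """Sum the price over the events whose id is in `ids`."""
    return sum(e["price"] for e in events if e["id"] in ids)


orders = [
    {"id": 1, "time": 0, "productName": "apple", "price": 10},
    {"id": 2, "time": 20, "productName": "pear", "price": 20},
    {"id": 3, "time": 25, "productName": "pear", "price": 30},
]
now = 40
in_time = time_window(orders, now, 30)             # orders 2 and 3
in_unique = unique_window(orders, "productName")   # orders 1 and 3

union_total = total_price(orders, in_time | in_unique)         # all three orders
intersection_total = total_price(orders, in_time & in_unique)  # order 3 only
```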

When combining a batch window into an intersection with another data window the combined data window gains batching semantics: Only when the batch criteria are fulfilled does the engine provide the batch of intersecting insert stream events. Multiple batch data windows may not be combined into an intersection.

In below table we provide additional examples for data window intersections:


For advanced users and for backward compatibility, it is possible to configure Esper to allow multiple data window views without either of the retain keywords, as described in Section 16.4.12.3, “Configuring Multi-Expiry Policy Defaults”.

Your from clause may assign a name to each stream. This assigned stream name can serve any of the following purposes.

First, the stream name can be used to disambiguate property names. The stream_name.property_name syntax uniquely identifies which property to select if property names overlap between streams. Here is an example:

select prod.productId, ord.productId from ProductEvent as prod, OrderEvent as ord

Second, the stream name can be used with a wildcard (*) character to select events in a join, or assign new names to the streams in a join:

// Select ProductEvent only
select prod.* from ProductEvent as prod, OrderEvent
// Assign column names 'product' and 'order' to each event
select prod.* as product, ord.* as order from ProductEvent as prod, OrderEvent as ord

Further, the stream name by itself can occur in any expression: The engine passes the event itself to that expression. For example, the engine passes the ProductEvent and the OrderEvent to the user-defined function 'checkOrder':

select prod.productId, MyFunc.checkOrder(prod, ord) 
from ProductEvent as prod, OrderEvent as ord

Last, you may invoke an instance method on each event of a stream, and pass parameters to the instance method as well. Instance method calls are allowed anywhere in an expression.

The next statement demonstrates this capability by invoking a method 'computeTotal' on OrderEvent events and a method 'getMultiplier' on ProductEvent events:

select ord.computeTotal(prod.getMultiplier()) from ProductEvent as prod, OrderEvent as ord

Instance methods may also be chained: Your EPL may invoke a method on the result returned by a method invocation.

Assume that your product event exposes a method getZone which returns a zone object. Assume that the Zone class declares a method checkZone. This example statement invokes a method chain:

select prod.getZone().checkZone("zone 1") from ProductEvent as prod

The where clause is an optional clause in EPL statements. Via the where clause event streams can be joined and correlated.

Tip

For filtering events in order to remove unwanted events, use the from clause instead as described in Section 5.4.1, “Filter-based Event Streams” or for patterns in Section 7.4, “Filter Expressions In Patterns”.

Place expressions that remove unwanted events in parentheses right after the event type, like ... from OrderEvent(fraud.severity = 5 and amount > 500) .... There is related information at Section 3.4, “Filters and Where-clauses” and Section 21.2.5, “Prefer stream-level filtering over where-clause filtering”.

Any expression can be placed in the where clause. Typically you would use comparison operators =, < , > , >=, <=, !=, <>, is null, is not null and logical combinations via and and or for joining, correlating or comparing events. The where clause introduces join conditions as outlined in Section 5.12, “Joining Event Streams”.

Some examples are listed below.

...where settlement.orderId = order.orderId
...where exists (select orderId from Settlement.win:time(1 min) where settlement.orderId = order.orderId)

The aggregate functions are further documented in Section 10.2, “Aggregation Functions”. You can use aggregate functions to calculate and summarize data from event properties.

For example, to find out the total price for all stock tick events in the last 30 seconds, type:

select sum(price) from StockTickEvent.win:time(30 sec)

Aggregation functions do not require the use of data windows. The examples herein specify data windows for the purpose of example. An alternative means to instruct the engine when to start and stop aggregating and on what level to aggregate is via context declarations.

For example, to find out the total price for all stock tick events since statement start, type:

select sum(price) from StockTickEvent

Here is the syntax for aggregate functions:

aggregate_function( [all | distinct] expression [,expression [,...]] 
    [, group_by:local_group_by] )

You can apply aggregate functions to all events in an event stream window or other view, or to one or more groups of events. From each set of events to which an aggregate function is applied, Esper generates a single value.

Expression is usually an event property name. However it can also be a constant, function, or any combination of event property names, constants, and functions connected by arithmetic operators.

You can provide a grouping dimension for each aggregation function by providing the optional group_by parameter as part of aggregation function parameters. Please refer to Section 5.6.4, “Specifying grouping for each aggregation function”.

For example, to find out the average price for all stock tick events in the last 30 seconds if the price was doubled:

select avg(price * 2) from StockTickEvent.win:time(30 seconds)

You can use the optional keyword distinct with all aggregate functions to eliminate duplicate values before the aggregate function is applied. The optional keyword all which performs the operation on all events is the default.

You can use aggregation functions in a select clause and in a having clause. You cannot use aggregate functions in a where clause, but you can use the where clause to restrict the events to which the aggregate is applied. The next query computes the average and sum of the price of stock tick events for the symbol IBM only, for the last 10 stock tick events regardless of their symbol.

select 'IBM stats' as title, avg(price) as avgPrice, sum(price) as sumPrice
from StockTickEvent.win:length(10)
where symbol='IBM'

In the above example the length window of 10 elements is not affected by the where clause, i.e. all events enter and leave the length window regardless of their symbol. If we only care about the last 10 IBM events, we need to add filter criteria as below.

select 'IBM stats' as title, avg(price) as avgPrice, sum(price) as sumPrice
from StockTickEvent(symbol='IBM').win:length(10)
where symbol='IBM'
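
The difference between filtering in the where clause and filtering at the stream level can be modeled in plain Python. This is a sketch of the semantics rather than Esper code; the function names avg_where and avg_filter and the sample ticks are made up, and a window of 2 events is used instead of 10 to keep the data short:

```python
from collections import deque


def avg_where(events, size, symbol):
    """Length window over all events; the where-clause is applied afterwards."""
    window = deque(events, maxlen=size)  # keeps the last `size` events
    prices = [e["price"] for e in window if e["symbol"] == symbol]
    return sum(prices) / len(prices) if prices else None


def avg_filter(events, size, symbol):
    """Stream-level filter first; only matching events enter the window."""
    matching = (e for e in events if e["symbol"] == symbol)
    window = deque(matching, maxlen=size)
    prices = [e["price"] for e in window]
    return sum(prices) / len(prices) if prices else None


ticks = [
    {"symbol": "IBM", "price": 10},
    {"symbol": "MSFT", "price": 99},
    {"symbol": "IBM", "price": 20},
    {"symbol": "MSFT", "price": 98},
]

# Window holds the last 2 ticks of any symbol; one of them is IBM.
where_result = avg_where(ticks, 2, "IBM")
# Window holds the last 2 IBM ticks.
filter_result = avg_filter(ticks, 2, "IBM")
```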

You can use aggregate functions with any type of event property or expression, with the following exceptions:

  1. You can use sum, avg, median, stddev, avedev with numeric event properties only

Esper ignores any null values returned by the event property or expression on which the aggregate function is operating, except for the count(*) function, which counts null values as well. All aggregate functions return null if the data set contains no events, or if all events in the data set contain only null values for the aggregated expression.
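
The treatment of null values and of the distinct keyword can be illustrated with a small plain-Python model, where None stands in for null. The helper names aggregate and average and the sample prices are hypothetical; the sketch covers value-returning functions such as sum and avg and does not model count(expression), which returns zero rather than null when only null values are encountered:

```python
def aggregate(values, func, distinct=False):
    """Apply `func` to the non-null values, optionally de-duplicated first."""
    present = [v for v in values if v is not None]  # null values are ignored
    if distinct:
        present = list(dict.fromkeys(present))  # drop duplicates, keep order
    if not present:
        return None  # no events, or only null values
    return func(present)


def average(values):
    return sum(values) / len(values)


prices = [10, None, 10, 40]
count_star = len(prices)                              # count(*) counts every row
sum_all = aggregate(prices, sum)                      # 10 + 10 + 40
sum_distinct = aggregate(prices, sum, distinct=True)  # 10 + 40
avg_all = aggregate(prices, average)                  # 60 / 3
only_nulls = aggregate([None, None], sum)             # nothing to aggregate
```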

The group by clause is optional in all EPL statements. The group by clause divides the output of an EPL statement into groups. You can group by one or more event property names, or by the result of computed expressions. When used with aggregate functions, group by retrieves the calculations in each subgroup. You can use group by without aggregate functions, but generally that can produce confusing results.

For example, the below statement returns the total price per symbol for all stock tick events in the last 30 seconds:

select symbol, sum(price) from StockTickEvent.win:time(30 sec) group by symbol

The syntax of the group by clause is:

group by aggregate_free_expression [, aggregate_free_expression] [, ...]

Esper places the following restrictions on expressions in the group by clause:

  1. Expressions in the group by cannot contain aggregate functions.

  2. When grouping an unbound stream, i.e. no data window is specified onto the stream providing groups, or when using output rate limiting with the ALL keyword, you should ensure your group-by expression does not return an unlimited number of values. If, for example, your group-by expression is a fine-grained timestamp, group state that accumulates for an unlimited number of groups potentially reduces available memory significantly. Use a @Hint as described below to instruct the engine when to discard group state.

You can list more than one expression in the group by clause to nest groups. Once the sets are established with group by the aggregation functions are applied. This statement posts the median volume for all stock tick events in the last 30 seconds per symbol and tick data feed. Esper posts one event for each group to statement listeners:

select symbol, tickDataFeed, median(volume) 
from StockTickEvent.win:time(30 sec) 
group by symbol, tickDataFeed

In the statement above the event properties in the select list (symbol, tickDataFeed) are also listed in the group by clause. The statement thus follows the SQL standard which prescribes that non-aggregated event properties in the select list must match the group by columns.

Esper also supports statements in which one or more event properties in the select list are not listed in the group by clause. The statement below demonstrates this case. It calculates the standard deviation since statement start over stock ticks aggregating by symbol and posting for each event the symbol, tickDataFeed and the standard deviation on price.

select symbol, tickDataFeed, stddev(price) from StockTickEvent group by symbol

The above example still aggregates the price event property based on the symbol, but produces one event per incoming event, not one event per group.
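
To picture the one-row-per-incoming-event behavior, here is a plain-Python sketch. It is not Esper code, and to keep it short it swaps in a running sum for the standard deviation used in the statement; the function name per_event_rows and the sample ticks are made up for illustration:

```python
from collections import defaultdict


def per_event_rows(events):
    """Emit one output row per incoming event with the running sum per symbol."""
    totals = defaultdict(float)  # aggregation state is kept per symbol only
    rows = []
    for e in events:
        totals[e["symbol"]] += e["price"]
        # tickDataFeed is not a grouping key; it is taken from the current event.
        rows.append((e["symbol"], e["tickDataFeed"], totals[e["symbol"]]))
    return rows


ticks = [
    {"symbol": "IBM", "tickDataFeed": "feedA", "price": 10},
    {"symbol": "IBM", "tickDataFeed": "feedB", "price": 5},
    {"symbol": "MSFT", "tickDataFeed": "feedA", "price": 7},
]
rows = per_event_rows(ticks)
```

The second row reports feedB together with the IBM total across both feeds, which is why such statements produce a row for every event rather than a row per group.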

Additionally, Esper supports statements in which one or more event properties in the group by clause are not listed in the select list. This is an example that calculates the mean deviation per symbol and tickDataFeed and posts one event per group with symbol and mean deviation of price in the generated events. Since tickDataFeed is not in the posted results, this can potentially be confusing.

select symbol, avedev(price) 
from StockTickEvent.win:time(30 sec) 
group by symbol, tickDataFeed

Expressions are also allowed in the group by list:

select symbol * price, count(*) from StockTickEvent.win:time(30 sec) group by symbol * price

If the group by expression resulted in a null value, the null value becomes its own group. All null values are aggregated into the same group. If you are using the count(expression) aggregate function which does not count null values, the count returns zero if only null values are encountered.

You can use a where clause in a statement with group by. Events that do not satisfy the conditions in the where clause are eliminated before any grouping is done. For example, the statement below posts the number of stock ticks in the last 30 seconds with a volume larger than 100, posting one event per group (symbol).

select symbol, count(*) from StockTickEvent.win:time(30 sec) where volume > 100 group by symbol

The Esper engine reclaims aggregation state aggressively when it determines that a group has no data points, based on the data in the data windows. When your application data creates a large number of groups with a small or zero number of data points then performance may suffer as state is reclaimed and created anew. Esper provides the @Hint('disable_reclaim_group') hint that you can specify as part of an EPL statement text to avoid group reclaim.

When aggregating values over an unbound stream (i.e. no data window is specified onto the stream) and when your group-by expression returns an unlimited number of values, for example when a timestamp expression is used, then please note the next hint.

The following sample statement aggregates stock tick events by timestamp, assuming the event type offers a property named timestamp that reflects time in high resolution, for example arrival or system time:

// Note the below statement could lead to an out-of-memory problem:
select symbol, sum(price) from StockTickEvent group by timestamp

As the engine has no means of detecting when aggregation state (here, the sums per timestamp value) can be discarded, you may use the following hints to control aggregation state lifetime.

The @Hint("reclaim_group_aged=age_in_seconds") hint instructs the engine to discard aggregation state that has not been updated for age_in_seconds seconds.

The optional @Hint("reclaim_group_freq=sweep_frequency_in_seconds") can be used in addition to control the frequency at which the engine sweeps aggregation state to determine aggregation state age and remove state that is older than age_in_seconds seconds. If the hint is not specified, the frequency defaults to the same value as age_in_seconds.

The updated sample statement with both hints:

// Instruct engine to remove state older than 10 seconds and sweep every 5 seconds
@Hint('reclaim_group_aged=10,reclaim_group_freq=5')
select symbol, sum(price) from StockTickEvent group by timestamp
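
The interplay of the two hints can be pictured with a simplified plain-Python model. It is an assumption-laden sketch and not the engine's implementation: the class AgedGroupState is a made-up name, time is passed in explicitly as seconds, and sweeping only when a new value arrives is a simplification chosen for the sketch.

```python
class AgedGroupState:
    """Per-group sums that are discarded when not updated for `max_age` seconds."""

    def __init__(self, max_age, sweep_freq=None):
        self.max_age = max_age
        # Without an explicit frequency, sweep as often as the maximum age.
        self.sweep_freq = max_age if sweep_freq is None else sweep_freq
        self.sums = {}         # group key -> running sum
        self.last_update = {}  # group key -> time of last update
        self.last_sweep = 0

    def add(self, now, key, value):
        self._maybe_sweep(now)
        self.sums[key] = self.sums.get(key, 0) + value
        self.last_update[key] = now

    def _maybe_sweep(self, now):
        if now - self.last_sweep < self.sweep_freq:
            return  # not yet time for the next sweep
        self.last_sweep = now
        aged = [k for k, t in self.last_update.items() if now - t > self.max_age]
        for key in aged:
            del self.sums[key]
            del self.last_update[key]


# Mirrors reclaim_group_aged=10 and reclaim_group_freq=5.
state = AgedGroupState(max_age=10, sweep_freq=5)
state.add(0, "t0", 1.0)
state.add(3, "t3", 2.0)
state.add(12, "t12", 3.0)  # triggers a sweep: "t0" is 12 seconds old and is dropped
```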

Variables may also be used to provide values for age_in_seconds and sweep_frequency_in_seconds.

This example statement uses a variable named varAge to control how long aggregation state remains in memory, and the engine defaults the sweep frequency to the same value as the variable provides:

@Hint('reclaim_group_aged=varAge')
select symbol, sum(price) from StockTickEvent group by timestamp

EPL supports the SQL-standard rollup, cube and grouping sets keywords. These keywords are available only in the group-by clause and instruct the engine to compute higher-level (or super-aggregate) aggregation values, i.e. to perform multiple levels of analysis (groupings) at the same time.

EPL also supports the SQL-standard grouping and grouping_id functions. These functions can be used in the select-clause, having-clause or order by-clause to obtain information about the current row's grouping level in expressions. Please see Section 10.1.7, “The Grouping Function”.

Detailed examples and information in respect to output rate limiting can be found in Section A.7, “Output for Fully-Aggregated, Grouped Queries With Rollup”.

Use the rollup keyword in the group-by lists of expressions to compute the equivalent of an OLAP dimension or hierarchy.

For example, the following statement outputs for each incoming event three rows. The first row contains the total volume per symbol and feed, the second row contains the total volume per symbol and the third row contains the total volume overall. This example aggregates across all events for each aggregation level (3 groupings) since it declares no data window:

select symbol, tickDataFeed, sum(volume) from StockTickEvent
group by rollup(symbol, tickDataFeed)

The value of tickDataFeed is null for the output row that contains the total per symbol and the output row that contains the total volume overall. The value of both symbol and tickDataFeed is null for the output row that contains the overall total.

Use the cube keyword in the group-by lists of expressions to compute a cross-tabulation.

The following statement outputs for each incoming event four rows. The first row contains the total volume per symbol and feed, the second row contains the total volume per symbol, the third row contains the total volume per feed and the fourth row contains the total volume overall (4 groupings):

select symbol, tickDataFeed, sum(volume) from StockTickEvent
group by cube(symbol, tickDataFeed)
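
The groupings that rollup and cube stand for can be enumerated with a few lines of plain Python. This is a sketch of the semantics, not Esper code; the function names rollup, cube and rows_for_last_event and the sample ticks are hypothetical, and None stands in for null:

```python
from itertools import combinations


def rollup(*columns):
    """Groupings for rollup: every prefix of the column list, longest first."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def cube(*columns):
    """Groupings for cube: every subset of the columns, largest first."""
    return [
        subset
        for n in range(len(columns), -1, -1)
        for subset in combinations(columns, n)
    ]


def rows_for_last_event(events, groupings, columns, value):
    """Output rows triggered by the most recent event, one per grouping."""
    latest = events[-1]
    rows = []
    for grouping in groupings:
        total = sum(
            e[value] for e in events
            if all(e[c] == latest[c] for c in grouping)
        )
        # Columns that are not part of the grouping are reported as None (null).
        row = tuple(latest[c] if c in grouping else None for c in columns)
        rows.append(row + (total,))
    return rows


columns = ("symbol", "tickDataFeed")
ticks = [
    {"symbol": "IBM", "tickDataFeed": "A", "volume": 100},
    {"symbol": "MSFT", "tickDataFeed": "A", "volume": 30},
    {"symbol": "IBM", "tickDataFeed": "B", "volume": 50},
    {"symbol": "IBM", "tickDataFeed": "A", "volume": 20},
]
rollup_rows = rows_for_last_event(ticks, rollup(*columns), columns, "volume")
cube_rows = rows_for_last_event(ticks, cube(*columns), columns, "volume")
```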

The grouping sets keywords allow you to specify only the groupings you want. They can thus be used to generate the same groupings that simple group-by expressions, rollup or cube would produce.

In this example each incoming event causes the engine to compute two output rows: The first row contains the total volume per symbol and the second row contains the total volume per feed (2 groupings):

select symbol, tickDataFeed, sum(volume) from StockTickEvent
group by grouping sets(symbol, tickDataFeed)

Your group-by expression can list grouping expressions and use rollup, cube and grouping sets keywords in addition or in combination.

This statement outputs the total per combination of symbol and feed and the total per symbol (2 groupings):

select symbol, tickDataFeed, sum(volume) from StockTickEvent
group by symbol, rollup(tickDataFeed)

You can specify combinations of expressions by using parentheses.

The next statement is equivalent and also outputs the total per symbol and feed and the total per symbol (2 groupings, note the parentheses):

select symbol, tickDataFeed, sum(volume) from StockTickEvent
group by grouping sets ((symbol, tickDataFeed), symbol)
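
One way to see why the two statements are equivalent is to treat each element of the group-by list as a list of groupings and to combine the lists as a cross product, which is the usual SQL rule and is assumed here to carry over to EPL. The following plain-Python sketch uses a hypothetical helper combine and writes out the groupings of rollup(tickDataFeed) by hand:

```python
from itertools import product


def combine(*grouping_lists):
    """Concatenate one grouping from each list, for every combination."""
    return [
        tuple(col for grouping in choice for col in grouping)
        for choice in product(*grouping_lists)
    ]


plain_symbol = [("symbol",)]           # the plain expression: a single grouping
rollup_feed = [("tickDataFeed",), ()]  # rollup(tickDataFeed): by feed, then overall

# group by symbol, rollup(tickDataFeed)
combined = combine(plain_symbol, rollup_feed)
# group by grouping sets ((symbol, tickDataFeed), symbol)
explicit = [("symbol", "tickDataFeed"), ("symbol",)]
```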

Use empty parentheses to aggregate across all dimensions.

This statement outputs the total per symbol, the total per feed and the total overall (3 groupings):

select symbol, tickDataFeed, sum(volume) from StockTickEvent
group by grouping sets (symbol, tickDataFeed, ())

The order of any output events for both insert and remove stream data is well-defined and exactly as indicated before. For example, specifying grouping sets ((), symbol, tickDataFeed) outputs a total overall, a total by symbol and a total by feed in that order. If the statement has an order-by-clause then the ordering criteria of the order-by-clause take precedence.

You can use rollup and cube within grouping sets.

This statement outputs the total per symbol and feed, the total per symbol, the total overall and the total by feed (4 groupings):

select symbol, tickDataFeed, sum(volume) from StockTickEvent
group by grouping sets (rollup(symbol, tickDataFeed), tickDataFeed)

Note

In order to use any of the rollup, cube and grouping sets keywords the statement must be fully-aggregated. All non-aggregated properties in the select-clause, having-clause or order-by-clause must also be listed in the group by clause.

EPL allows each aggregation function to specify its own grouping criteria. This is useful for aggregating across multiple dimensions.

The syntax for the group_by parameter for use with aggregation functions is:

group_by: ( [expression [,expression [,...]]] )

The group_by identifier can occur at any place within the aggregation function parameters. It is followed by a colon and, within parentheses, an optional list of grouping expressions. The parentheses are not required when providing a single expression. For grouping on the top level (overall aggregation) please use () empty parentheses.

The presence of group_by aggregation function parameters, the grouping expressions as well as the group-by clause determine the number of output rows for queries as further described in Section 3.7.2, “Output for Aggregation and Group-By”.

For un-grouped queries (without a group by clause), if any aggregation function specifies a group_by other than the () overall group, the query executes as aggregated and un-grouped.

For example, the next statement is an aggregated (but not fully aggregated) and ungrouped query and outputs various totals for each arriving event:

select sum(price, group_by:()) as totalPriceOverall,
  sum(price, group_by:account) as totalPricePerAccount,
  sum(price, group_by:(account, feed)) as totalPricePerAccountAndFeed
from Orders
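
The three totals computed by this statement can be modeled in plain Python as three independent pieces of aggregation state that are all updated for each arriving event. The sketch is not Esper code; the function name totals_per_event and the sample orders are made up for illustration:

```python
from collections import defaultdict


def totals_per_event(orders):
    """Emit one row per arriving order with totals at three grouping levels."""
    overall = 0
    per_account = defaultdict(int)
    per_account_feed = defaultdict(int)
    rows = []
    for o in orders:
        overall += o["price"]                                      # group_by:()
        per_account[o["account"]] += o["price"]                    # group_by:account
        per_account_feed[(o["account"], o["feed"])] += o["price"]  # group_by:(account, feed)
        rows.append({
            "totalPriceOverall": overall,
            "totalPricePerAccount": per_account[o["account"]],
            "totalPricePerAccountAndFeed": per_account_feed[(o["account"], o["feed"])],
        })
    return rows


orders = [
    {"account": "a1", "feed": "f1", "price": 10},
    {"account": "a1", "feed": "f2", "price": 5},
    {"account": "a2", "feed": "f1", "price": 7},
]
rows = totals_per_event(orders)
```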

For grouped queries (with a group by clause), if all aggregation functions specify either no group_by or group_by criteria that subsume the criteria in the group by clause, the query executes as a fully-aggregated and grouped query. Otherwise the query executes as an aggregated and grouped query.

The next example is fully-aggregated and grouped and it computes, for the last one minute of orders, the ratio of orders per account compared to all orders:

select count(*)/count(*, group_by:()) as ratio
from Orders.win:time(1 min) group by account

The next example is an aggregated (and not fully-aggregated) and grouped query that in addition outputs a count per order category:

select count(*) as cnt, count(*, group_by:()) as cntOverall, count(*, group_by:(category)) as cntPerCategory
from Orders.win:time(1 min) group by account

Please note the following restrictions:

  1. Expressions in the group_by cannot contain aggregate functions.

  2. Hints pertaining to group-by are not available when a statement specifies aggregation functions with group_by.

  3. The group_by aggregation function parameters are not available in subqueries, match-recognize, statements that aggregate into tables using into table or in combination with rollup and grouping sets.

Use the having clause to pass or reject events defined by the group-by clause. The having clause sets conditions for the group by clause in the same way where sets conditions for the select clause, except where cannot include aggregate functions, while having often does.

This statement is an example of a having clause with an aggregate function. It posts the total price per symbol for the last 30 seconds of stock tick events for only those symbols in which the total price exceeds 1000. The having clause eliminates all symbols where the total price is equal to or less than 1000.

select symbol, sum(price) 
from StockTickEvent.win:time(30 sec) 
group by symbol 
having sum(price) > 1000

To include more than one condition in the having clause, combine the conditions with and, or or not. This is shown in the statement below, which selects only groups with a total price greater than 1000 and an average volume less than 500.

select symbol, sum(price), avg(volume)
from StockTickEvent.win:time(30 sec) 
group by symbol 
having sum(price) > 1000 and avg(volume) < 500

A statement with the having clause should also have a group by clause. If you omit group-by, all the events not excluded by the where clause return as a single group. In that case having acts like a where except that having can have aggregate functions.

The having clause can also be used without a group by clause. The example below posts events where the price is less than the current running average price of all stock tick events in the last 30 seconds.

select symbol, price, avg(price) 
from StockTickEvent.win:time(30 sec) 
having price < avg(price)
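The behavior of the statement above can be approximated in plain Java as follows. This is a hedged sketch rather than Esper API: the class HavingWithoutGroupBySketch and its onTick method are hypothetical, timestamps are supplied by the caller in milliseconds, and the exact expiry boundary of the engine's time window is an assumption. The point it illustrates is that the average already includes the arriving event when the having condition is checked.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of: select ... from StockTickEvent.win:time(30 sec) having price < avg(price)
class HavingWithoutGroupBySketch {
    private static final class Tick {
        final long timestampMillis;
        final double price;
        Tick(long timestampMillis, double price) {
            this.timestampMillis = timestampMillis;
            this.price = price;
        }
    }

    private final long windowMillis;
    private final Deque<Tick> window = new ArrayDeque<>();
    private double sum;

    HavingWithoutGroupBySketch(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Returns true when the arriving tick passes the having condition.
    boolean onTick(long timestampMillis, double price) {
        // Expire ticks that are at least windowMillis old (boundary handling is an assumption).
        while (!window.isEmpty()
                && window.peekFirst().timestampMillis <= timestampMillis - windowMillis) {
            sum -= window.removeFirst().price;
        }
        window.addLast(new Tick(timestampMillis, price));
        sum += price;
        double avg = sum / window.size(); // running average includes the new tick
        return price < avg;
    }

    public static void main(String[] args) {
        HavingWithoutGroupBySketch sketch = new HavingWithoutGroupBySketch(30_000);
        System.out.println(sketch.onTick(0, 10));
        System.out.println(sketch.onTick(1_000, 20));
        System.out.println(sketch.onTick(2_000, 12));
    }
}
```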

When you include filters, the where condition, the group by clause and the having condition in an EPL statement, the sequence in which each clause affects events determines the final result.

The following query illustrates the use of filter, where, group by and having clauses in one statement with a select clause containing an aggregate function.

select tickDataFeed, stddev(price)
from StockTickEvent(symbol='IBM').win:length(10) 
where volume > 1000
group by tickDataFeed 
having stddev(price) > 0.8

Esper filters events using the filter criteria for the event stream StockTickEvent. In the example above only events with symbol IBM enter the length window over the last 10 events; all other events are simply discarded. The where clause removes any events posted by the length window (events entering the window and events leaving the window) that do not match the condition of volume greater than 1000. Remaining events are applied to the stddev standard deviation aggregate function for each tick data feed as specified in the group by clause. Each tickDataFeed value generates one event. Esper applies the having clause and only lets events pass for tickDataFeed groups with a standard deviation of price greater than 0.8.

The keyed segmented context create context ... partition by and the group by clause as well as the built-in std:groupwin view are similar in their ability to group events but very different in their semantics. This section explains the key differences in their behavior and use.

The keyed segmented context as declared with create context ... partition by and context .... select ... creates a new context partition per key value(s). The engine maintains separate data window views as well as separate aggregations per context partition; thereby the keyed segmented context applies to both. See Section 4.2.2, “Keyed Segmented Context” for additional examples.

The group by clause works together with aggregation functions in your statement to produce an aggregation result per group. In greater detail, this means that when a new event arrives, the engine applies the expressions in the group by clause to determine a grouping key. If the engine has not encountered that grouping key before (a new group), the engine creates a set of new aggregation results for that grouping key and performs the aggregation changing that new set of aggregation results. If the grouping key points to an existing set of prior aggregation results (an existing group), the engine performs the aggregation changing the prior set of aggregation results for that group.

The std:groupwin view is a built-in view that groups events into data windows. The view is described in greater detail in Section 13.3.2, “Grouped Data Window (std:groupwin)”. Its primary use is to create a separate data window per group, or more generally to create separate instances of all its sub-views for each grouping key encountered.

The table below summarizes the point:


Please review the performance section for advice related to performance or memory-use.

The next example shows queries that produce equivalent results. The query using the group by clause is generally preferable as it is easier to read. The second form introduces the stat:uni view, which computes univariate statistics for a given property:

select symbol, avg(price) from StockTickEvent group by symbol
// ... is equivalent to ...
select symbol, average from StockTickEvent.std:groupwin(symbol).stat:uni(price)

The next example shows two queries that are NOT equivalent as the length window is ungrouped in the first query, and grouped in the second query:

select symbol, sum(price) from StockTickEvent.win:length(10) group by symbol
// ... NOT equivalent to ...
select symbol, sum(price) from StockTickEvent.std:groupwin(symbol).win:length(10)

The key difference between the two statements is that in the first statement the length window is ungrouped and applies to all events regardless of group, while in the second query each group gets its own instance of a length window. For example, in the second query events arriving for symbol "ABC" get a length window of 10 events, and events arriving for symbol "DEF" get their own length window of 10 events.
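To make the difference in retained events concrete, the following plain-Java sketch counts how many events per symbol remain in the data window(s) after a sequence of arrivals. It is an illustration only and not Esper API: the class LengthWindowGroupingSketch and both method names are hypothetical, and the sketch models window contents only, not aggregation or output.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of which events a length window retains, ungrouped versus per group.
class LengthWindowGroupingSketch {

    // One length window shared by all symbols (as with win:length(size) alone).
    static Map<String, Integer> retainedWithSharedWindow(String[] symbols, int size) {
        Deque<String> window = new ArrayDeque<>();
        for (String symbol : symbols) {
            window.addLast(symbol);
            if (window.size() > size) {
                window.removeFirst(); // oldest event leaves, whatever its symbol
            }
        }
        Map<String, Integer> counts = new HashMap<>();
        for (String symbol : window) {
            counts.merge(symbol, 1, Integer::sum);
        }
        return counts;
    }

    // One length window per symbol (as with std:groupwin(symbol).win:length(size)).
    static Map<String, Integer> retainedWithGroupedWindows(String[] symbols, int size) {
        Map<String, Integer> counts = new HashMap<>();
        for (String symbol : symbols) {
            // Each symbol's own window holds at most `size` events.
            counts.merge(symbol, 1, (held, one) -> Math.min(held + one, size));
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] arrivals = {"ABC", "ABC", "ABC", "DEF", "DEF"};
        System.out.println(retainedWithSharedWindow(arrivals, 2));
        System.out.println(retainedWithGroupedWindows(arrivals, 2));
    }
}
```

With a window size of 2 and three "ABC" events followed by two "DEF" events, the shared window holds only the two "DEF" events, whereas the grouped windows hold two events for each symbol.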

The output clause is optional in Esper and is used to control or stabilize the rate at which events are output and to suppress output events. The EPL language provides for several different ways to control output rate.

Here is the syntax for the output clause that specifies a rate in time interval or number of events:

output [after suppression_def] 
  [[all | first | last | snapshot] every output_rate [seconds | events]]
[and when terminated]

An alternate syntax specifies the time period between output as outlined in Section 5.2.1, “Specifying Time Periods” :

output [after suppression_def] 
  [[all | first | last | snapshot] every time_period]
[and when terminated]

A crontab-like schedule can also be specified. The schedule parameters follow the pattern observer parameters and are further described in Section 7.6.4, “Crontab (timer:at)” :

output [after suppression_def] 
  [[all | first | last | snapshot] at 
   (minutes, hours, days of month, months, days of week [, seconds])]
[and when terminated]

For use with contexts, in order to trigger output only when a context partition terminates, specify when terminated as further described in Section 4.5, “Output When Context Partition Ends”:

output [after suppression_def] 
  [[all | first | last | snapshot] when terminated 
  [and termination_expression]
  [then set variable_name = assign_expression [, variable_name = assign_expression [,...]]]
  ]

Last, output can be controlled by an expression that may contain variables, user-defined functions and information about the number of collected events. Output that is controlled by an expression is discussed in detail below.

The after keyword and suppression_def can appear alone or together with further output conditions and suppress output events.

For example, the following statement outputs, every 60 seconds, the total price for all orders in the 30-minute time window:

select sum(price) from OrderEvent.win:time(30 min) output snapshot every 60 seconds

The all keyword is the default and specifies that all events in a batch should be output, each incoming row in the batch producing an output row. Note that for statements that group via the group by clause, the all keyword provides special behavior as below.

The first keyword specifies that only the first event in an output batch is to be output. Using the first keyword instructs the engine to output the first matching event as soon as it arrives, and then to ignore matching events for the time interval or number of events specified. After the time interval has elapsed, or the number of matching events has been reached, the next first matching event is output again and for the following interval the engine again ignores matching events. For statements that group via the group by clause, the first keyword provides special behavior as below.

The last keyword specifies to only output the last event at the end of the given time interval or after the given number of matching events have been accumulated. Again, for statements that group via the group by clause the last keyword provides special behavior as below.

The snapshot keyword is often used with unbound streams and/or aggregation to output current aggregation results. While the other keywords control how a batch of events between output intervals is being considered, the snapshot keyword outputs current state of a statement independent of the last batch. Its output is comparable to the iterator method provided by a statement. More information on output snapshot can be found in Section 5.7.1.3, “Output Snapshot”.

The output_rate is the frequency at which the engine outputs events. It can be specified in terms of time or number of events. The value can be a number to denote a fixed output rate, or the name of a variable whose value is the output rate. By means of a variable the output rate can be controlled externally and changed dynamically at runtime.

Please consult the Appendix A, Output Reference and Samples for detailed information on insert and remove stream output for the various output clause keywords.

For use with contexts you may append the keywords and when terminated to trigger output at the rate defined and in addition trigger output when the context partition terminates. Please see Section 4.5, “Output When Context Partition Ends” for details.

The time interval can also be specified in terms of minutes; the following statement is identical to the first one.

select * from StockTickEvent output every 1.5 minutes

A second way that output can be stabilized is by batching events until a certain number of events have been collected:

select * from StockTickEvent output every 5 events

Additionally, event output can be further modified by the optional last keyword, which causes output of only the last event to arrive into an output batch.

select * from StockTickEvent output last every 5 events

Using the first keyword you can be notified at the start of the interval. This allows you to watch for situations such as a rate falling below a threshold: you are informed the moment it first happens, and thereafter only once per specified output interval.

select * from TickRate where rate<100 output first every 60 seconds

A sample statement using the Unix "crontab"-command schedule is shown next. See Section 7.6.4, “Crontab (timer:at)” for details on schedule syntax. Here, output occurs every 15 minutes from 8am to 5:45pm (hours 8 to 17 at 0, 15, 30 and 45 minutes past the hour):

select symbol, sum(price) from StockTickEvent group by symbol output at (*/15, 8:17, *, *, *)

Output can also be controlled by an expression that may check variable values, use user-defined functions and query built-in properties that provide additional information. The synopsis is as follows:

output [after suppression_def] 
  [[all | first | last | snapshot] when trigger_expression 
    [then set variable_name = assign_expression [, variable_name = assign_expression [,...]]]
  [and when terminated 
    [and termination_expression]
    [then set variable_name = assign_expression [, variable_name = assign_expression [,...]]]
  ]

The when keyword must be followed by a trigger expression returning a boolean value of true or false, indicating whether to output. Use the optional then keyword to change variable values after the trigger expression evaluates to true. An assignment expression assigns a new value to variable(s).

For use with contexts you may append the keywords and when terminated to also trigger output when the context partition terminates. Please see Section 4.5, “Output When Context Partition Ends” for details. You may optionally specify a termination expression. If that expression is provided the engine evaluates the expression when the context partition terminates: The evaluation result of true means output occurs when the context partition terminates, false means no output occurs when the context partition terminates. You may specify then set followed by a list of assignments to assign variables. Assignments are executed on context partition termination regardless of the termination expression, if present.

Let's consider an example. The next statement assumes that your application has defined a variable by name OutputTriggerVar of boolean type. The statement outputs rows only when the OutputTriggerVar variable has a boolean value of true:

select sum(price) from StockTickEvent output when OutputTriggerVar = true

The engine evaluates the trigger expression when streams and data views post one or more insert or remove stream events after considering the where clause, if present. It also evaluates the trigger expression when any of the variables used in the trigger expression, if any, changes value. Thus output occurs as follows:

  1. When there are insert or remove stream events and the when trigger expression evaluates to true, the engine outputs the resulting rows.

  2. When any of the variables in the when trigger expression changes value, the engine evaluates the expression and outputs results. Result output occurs within the minimum time interval of timer resolution (100 milliseconds).

By adding a then part to the EPL, we can reset any variables after the trigger expression evaluated to true:

select sum(price) from StockTickEvent 
  output when OutputTriggerVar = true  
  then set OutputTriggerVar = false

Expressions in the when and then may, for example, use variables, user defined functions or any of the built-in named properties that are described in the below list.

The following built-in properties are available for use:


The values provided by count_insert and count_remove are not continuous: the number returned for these properties may 'jump' up rather than count up by 1. The counts reset to zero upon output.

The following restrictions apply to expressions used in the output rate clause:

  • Event property names cannot be used in the output clause.

  • Aggregation functions cannot be used in the output clause.

  • The prev (previous event) function and the prior (prior event) function cannot be used in the output clause.

Remove stream events can also be useful in conjunction with aggregation and the output clause: When the engine posts remove stream events for fully-aggregated queries, it presents the aggregation state before the expiring event leaves the data window. Your application can thus easily obtain a delta between the new aggregation value and the prior aggregation value.

The engine evaluates the having-clause at the granularity of the data posted by views. That is, if you utilize a time window and output every 10 events, the having clause applies to each individual event or events entering and leaving the time window (and not once per batch of 10 events).

The output clause interacts in two ways with the group by and having clauses. First, in the output every n events case, the number n refers to the number of events arriving into the group by clause. That is, if the group by clause outputs only 1 event per group, or if the arriving events don't satisfy the having clause, then the actual number of events output by the statement could be fewer than n.

Second, the last, all and first keywords have special meanings when used in a statement with aggregate functions and the group by clause:

Please consult the Appendix A, Output Reference and Samples for detailed information on insert and remove stream output for aggregation and group-by.

By adding an output rate limiting clause to a statement that contains a group by clause we can control output of groups to obtain one row for each group, generating an event per group at the given output frequency.

The next statement outputs total price per symbol cumulatively (no data window was used here). As it specifies the all keyword, the statement outputs the current value for all groups seen so far, regardless of whether the group was updated in the last interval. Output occurs after an interval of 5 seconds passed and at the end of each subsequent interval:

select symbol, sum(price) from StockTickEvent group by symbol output all every 5 seconds

The below statement outputs total price per symbol considering events in the last 3 minutes. When events leave the 3-minute data window output also occurs as new aggregation values are computed. The last keyword instructs the engine to output only those groups that had changes. Output occurs after an interval of 10 seconds passed and at the end of each subsequent interval:

select symbol, sum(price) from StockTickEvent.win:time(3 min)
group by symbol output last every 10 seconds

This statement also outputs total price per symbol considering events in the last 3 minutes. The first keyword instructs the engine to output as soon as there is a new value for a group. After output for a given group the engine suppresses output for the same group for 10 seconds and does not suppress output for other groups. Output occurs again for that group after the interval when the group has new value(s):

select symbol, sum(price) from StockTickEvent.win:time(3 min)
group by symbol output first every 10 seconds
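The difference between the all and last keywords for grouped statements, as described above, can be modeled in a few lines of plain Java. This is a simplified sketch and not Esper API: the class GroupedOutputSketch and its methods are hypothetical, no data window is modeled (totals are cumulative), and the caller is assumed to invoke a flush method at the end of each output interval.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of sum(price) grouped by symbol with rate-limited output.
class GroupedOutputSketch {
    private final Map<String, Double> totals = new LinkedHashMap<>();
    private final Set<String> changedSinceLastOutput = new LinkedHashSet<>();

    void onEvent(String symbol, double price) {
        totals.merge(symbol, price, Double::sum);
        changedSinceLastOutput.add(symbol);
    }

    // Models "output all": one row for every group seen so far, changed or not.
    Map<String, Double> flushAll() {
        changedSinceLastOutput.clear();
        return new LinkedHashMap<>(totals);
    }

    // Models "output last": one row only for groups that changed in the interval.
    Map<String, Double> flushLast() {
        Map<String, Double> rows = new LinkedHashMap<>();
        for (String symbol : changedSinceLastOutput) {
            rows.put(symbol, totals.get(symbol));
        }
        changedSinceLastOutput.clear();
        return rows;
    }

    public static void main(String[] args) {
        GroupedOutputSketch sketch = new GroupedOutputSketch();
        sketch.onEvent("IBM", 10);
        sketch.onEvent("MSFT", 5);
        System.out.println(sketch.flushLast());
        sketch.onEvent("IBM", 1);
        System.out.println(sketch.flushLast());
        System.out.println(sketch.flushAll());
    }
}
```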

The order by clause is optional. It is used for ordering output events by their properties, or by expressions involving those properties.

For example, the following statement outputs batches of 5 or more stock tick events that are sorted first by price ascending and then by volume ascending:

select symbol from StockTickEvent.win:time(60 sec) 
output every 5 events 
order by price, volume

Here is the syntax for the order by clause:

order by expression [asc | desc] [, expression [asc | desc]] [, ...]

If the order by clause is absent then the engine still makes certain guarantees about the ordering of output:

  • If the statement is not a join, does not group via group by clause and does not declare grouped data windows via std:groupwin view, the order in which events are delivered to listeners and through the iterator pull API is the order of event arrival.

  • If the statement is a join or outer join, or groups, then the order in which events are delivered to listeners and through the iterator pull API is not well-defined. Use the order by clause if your application requires events to be delivered in a well-defined order.

Esper places the following restrictions on the expressions in the order by clause:

  1. All aggregate functions that appear in the order by clause must also appear in the select expression.

Otherwise, any kind of expression that can appear in the select clause, as well as any name defined in the select clause, is also valid in the order by clause.

By default all sort operations on string values are performed via the compare method and are thus not locale dependent. To account for differences in language or locale, see Section 16.4.21, “Engine Settings related to Language and Locale” to change this setting.
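The practical effect of a locale-independent string comparison can be seen in plain Java. The sketch below assumes that the default behavior corresponds to String.compareTo and contrasts it with a locale-aware java.text.Collator; the class StringSortOrderSketch and its method names are hypothetical and are not part of Esper.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical comparison of code-point ordering versus locale-aware ordering.
class StringSortOrderSketch {

    // Natural String ordering (String.compareTo): not locale dependent.
    static List<String> sortWithCompareTo(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        return sorted;
    }

    // Locale-aware ordering using the collation rules of the given locale.
    static List<String> sortWithCollator(List<String> values, Locale locale) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Collator.getInstance(locale));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("b", "A", "a", "B");
        System.out.println(sortWithCompareTo(values));
        System.out.println(sortWithCollator(values, Locale.ENGLISH));
    }
}
```

With natural ordering all upper-case ASCII letters sort before lower-case ones, whereas a collator groups letters by their base character first.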

The limit clause is typically used together with the order by and output clause to limit your query results to those that fall within a specified range. You can use it to receive the first given number of result rows, or to receive a range of result rows.

There are two syntaxes for the limit clause; each can be parameterized by integer constants or by variable names. The first syntax is shown below:

limit row_count [offset offset_count]

The required row_count parameter specifies the number of rows to output. The row_count can be an integer constant and can also be the name of the integer-type variable to evaluate at runtime.

The optional offset_count parameter specifies the number of rows that should be skipped (offset) at the beginning of the result set. A variable can also be used for this parameter.

The next sample EPL query outputs the top 10 counts per property 'uri' every 1 minute.

select uri, count(*) from WebEvent 
group by uri 
output snapshot every 1 minute
order by count(*) desc 
limit 10

The next statement demonstrates the use of the offset keyword. It outputs ranks 3 to 10 per property 'uri' every 1 minute:

select uri, count(*) from WebEvent 
group by uri 
output snapshot every 1 minute
order by count(*) desc 
limit 8 offset 2

The second syntax for the limit clause is for SQL standard compatibility and specifies the offset first, followed by the row count:

limit offset_count[, row_count]

The following are equivalent:

limit 8 offset 2
// ...equivalent to
limit 2, 8

A negative value for row_count returns an unlimited number of rows, and a zero value returns no rows. If variables are used, then the current variable value at the time of output dictates the row count and offset. A variable returning a null value for row_count also returns an unlimited number of rows.

A negative value for offset is not allowed. If your variable returns a negative or null value for offset then the value is assumed to be zero (i.e. no offset).
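The rules for row_count and offset values can be summarized as a small plain-Java helper. This is a sketch of the documented rules for variable-supplied values, not Esper's implementation: the class LimitClauseSketch and the method applyLimit are hypothetical, and null stands for a variable that currently has no value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the limit clause applied to an already ordered list of result rows.
class LimitClauseSketch {

    static <T> List<T> applyLimit(List<T> rows, Integer rowCount, Integer offset) {
        // A negative or null offset is treated as zero (no offset).
        int skip = (offset == null || offset < 0) ? 0 : offset;
        if (skip >= rows.size()) {
            return new ArrayList<>();
        }
        // A negative or null row count means an unlimited number of rows.
        if (rowCount == null || rowCount < 0) {
            return new ArrayList<>(rows.subList(skip, rows.size()));
        }
        // A zero row count yields no rows; otherwise take at most rowCount rows.
        int end = (int) Math.min((long) skip + rowCount, rows.size());
        return new ArrayList<>(rows.subList(skip, end));
    }

    public static void main(String[] args) {
        List<Integer> ranks = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12);
        // Corresponds to "limit 8 offset 2" (or "limit 2, 8"): ranks 3 to 10.
        System.out.println(applyLimit(ranks, 8, 2));
    }
}
```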

The iterator pull API also honors the limit clause, if present.

The insert into clause is optional in Esper. The clause can be specified to make the results of a statement available as an event stream for use in further statements, or to insert events into a named window or table. The clause can also be used to merge multiple event streams to form a single stream of events.

The syntax for the insert into clause is as follows:

insert [istream | irstream | rstream] into event_stream_name  [ (property_name [, property_name] ) ]

The istream (default) and rstream keywords are optional. If no keyword or the istream keyword is specified, the engine supplies the insert stream events generated by the statement. The insert stream consists of the events entering the respective window(s) or stream(s). If the rstream keyword is specified, the engine supplies the remove stream events generated by the statement. The remove stream consists of the events leaving the respective window(s).

If your application specifies irstream, the engine inserts into the new stream both the insert and remove stream. This is often useful in connection with the istream built-in function that returns an inserted/removed boolean indicator for each event, see Section 10.1.10, “The Istream Function”.

The event_stream_name is an identifier that names the event stream (and also implicitly names the types of events in the stream) generated by the engine. It may also specify a named window name or a table name. The identifier can be used in further statements to filter and process events of that event stream, unless inserting into a table. The insert into clause can consist of just an event stream name, or an event stream name and one or more property names.

The engine also allows listeners to be attached to a statement that contains an insert into clause. Listeners receive all events posted to the event stream.

To merge event streams, simply use the same event_stream_name identifier in all EPL statements that merge their result event streams. Make sure that the statements use the same number and names of event properties and that the event property types match.

Esper places the following restrictions on the insert into clause:

  1. The number of elements in the select clause must match the number of elements in the insert into clause if the clause specifies a list of event property names.

  2. If the event stream name has already been defined by a prior statement or configuration, and the event property names and/or event types do not match, an exception is thrown at statement creation time.

The following sample inserts into an event stream by name CombinedEvent:

insert into CombinedEvent
select A.customerId as custId, A.timestamp - B.timestamp as latency
  from EventA.win:time(30 min) A, EventB.win:time(30 min) B
 where A.txnId = B.txnId

Each event in the CombinedEvent event stream has two event properties named "custId" and "latency". The events generated by the above statement can be used in further statements, such as shown in the next statement:

select custId, sum(latency)
  from CombinedEvent.win:time(30 min)
 group by custId

The example statement below shows the alternative form of the insert into clause that explicitly defines the property names to use.

insert into CombinedEvent (custId, latency)
select A.customerId, A.timestamp - B.timestamp 
...

The rstream keyword can be useful to indicate to the engine to generate only remove stream events. This can be useful if we want to trigger actions when events leave a window rather than when events enter a window. The statement below generates CombinedEvent events when EventA and EventB leave the window after 30 minutes.

insert rstream into CombinedEvent
select A.customerId as custId, A.timestamp - B.timestamp as latency
  from EventA.win:time(30 min) A, EventB.win:time(30 min) B
 where A.txnId = B.txnId

The insert into clause can be used in connection with patterns to provide pattern results to further statements for analysis:

insert into ReUpEvent
select linkUp.ip as ip 
from pattern [every linkDown=LinkDownEvent -> linkUp=LinkUpEvent(ip=linkDown.ip)]

The insert into clause allows you to merge multiple event streams into a single event stream. The clause names an event stream to insert into by specifying an event_stream_name. The first statement that inserts into the named stream defines the stream's event types. Further statements that insert into the same event stream must match the type of events inserted into the stream as declared by the first statement.

One approach to merging event streams specifies individual column names either in the select clause or in the insert into clause of the statement. This approach has been shown in earlier examples.

Another approach to merging event streams specifies the wildcard (*) in the select clause (or the stream wildcard) to select the underlying event. The events in the event stream must then have the same event type as generated by the from clause.

Assume a statement creates an event stream named MergedStream by selecting OrderEvent events:

insert into MergedStream select * from OrderEvent

A statement can use the stream wildcard selector to select only OrderEvent events in a join:

insert into MergedStream select ord.* from ItemScanEvent, OrderEvent as ord

And a statement may also use an application-supplied user-defined function to convert events to OrderEvent instances:

insert into MergedStream select MyLib.convert(item) from ItemScanEvent as item

Esper specifically recognizes a conversion function as follows: A conversion function must be the only selected column, and it must return either a Java object or java.util.Map or Object[] (object array). Your EPL should not use the as keyword to assign a column name.
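A conversion function of this kind is an ordinary static Java method. The sketch below shows one possible shape for MyLib.convert; the properties of ItemScanEvent and OrderEvent (itemId, quantity, orderId) are invented for illustration and are not taken from this document. In a real application the classes would typically be public, live in their own files, and the function class would have to be made known to the engine through configuration.

```java
// Hypothetical source event; property names are assumptions for illustration.
class ItemScanEvent {
    private final String itemId;
    private final int quantity;

    ItemScanEvent(String itemId, int quantity) {
        this.itemId = itemId;
        this.quantity = quantity;
    }

    public String getItemId() { return itemId; }
    public int getQuantity() { return quantity; }
}

// Hypothetical target event type of the MergedStream stream.
class OrderEvent {
    private final String orderId;
    private final int quantity;

    OrderEvent(String orderId, int quantity) {
        this.orderId = orderId;
        this.quantity = quantity;
    }

    public String getOrderId() { return orderId; }
    public int getQuantity() { return quantity; }
}

// Conversion function: the only selected column, returning a Java object.
class MyLib {
    public static OrderEvent convert(ItemScanEvent item) {
        // The mapping from item id to order id is purely illustrative.
        return new OrderEvent("order-" + item.getItemId(), item.getQuantity());
    }

    public static void main(String[] args) {
        OrderEvent order = convert(new ItemScanEvent("sku-1", 3));
        System.out.println(order.getOrderId() + " x " + order.getQuantity());
    }
}
```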

A variant stream is a predefined stream into which events of multiple disparate event types can be inserted.

A variant stream name may appear anywhere in a pattern or from clause. In a pattern, a filter against a variant stream matches any events of any of the event types inserted into the variant stream. In a from clause including for named windows, views declared onto a variant stream may hold events of any of the event types inserted into the variant stream.

A variant stream is thus useful in problems that require different types of event to be treated the same.

Variant streams can be declared by means of create variant schema or can be predefined via runtime or initialization-time configuration as described in Section 16.4.27, “Variant Stream”. Your application may declare or predefine variant streams to carry events of a limited set of event types, or you may choose the variant stream to carry any and all types of events. This choice affects what event properties are available for consuming statements or patterns of the variant stream.

Assume that an application predefined a variant stream named OrderStream to carry only ServiceOrder and ProductOrder events. An insert into clause inserts events into the variant stream:

insert into OrderStream select * from ServiceOrder
insert into OrderStream select * from ProductOrder

Here is a sample statement that consumes the variant stream and outputs a total price per customer id for the last 30 seconds of ServiceOrder and ProductOrder events:

select customerId, sum(price) from OrderStream.win:time(30 sec) group by customerId

If your application predefines the variant stream to hold specific types of events, as the sample above did, then all event properties that are common to all specified types are visible on the variant stream, including nested, indexed and mapped properties. For access to properties that are only available on one of the types, the dynamic property syntax must be used. In the example above, customerId and price were properties common to both ServiceOrder and ProductOrder events.

For example, here is a consuming statement that selects a service duration property that only ServiceOrder events have, and that must therefore be cast to double, with null values replaced by zero, in order to aggregate:

select customerId, sum(coalesce(cast(serviceDuration?, double), 0)) 
from OrderStream.win:time(30 sec) group by customerId

If your application predefines a variant stream to hold any type of events (the any type variance), then all event properties of the variant stream are effectively dynamic properties.

For example, an application may define an OutgoingEvents variant stream to hold any type of event. The next statement is a sample consumer of the OutgoingEvents variant stream that looks for the destination property and fires for each event in which the property exists with a value of 'email':

select * from OutgoingEvents(destination = 'email')

When you declare the inserted-into event type in advance of the statement that inserts, the engine compares the inserted-into event type information to the return type of expressions in the select-clause. The comparison uses the column alias assigned to each select-clause expression using the as keyword.

When the inserted-into column type is an event type and when using a subquery or the new operator, the engine compares column names assigned to subquery columns or new operator columns.

For example, assume a PurchaseOrder event type that has a property called items that consists of Item rows:

create schema Item(name string, price double)
create schema PurchaseOrder(orderId string, items Item[])

Declare a statement that inserts into the PurchaseOrder stream:

insert into PurchaseOrder 
select '001' as orderId, new {name='i1', price=10} as items
from TriggerEvent

The aliases assigned to the first and second expressions in the select-clause, namely orderId and items, both match event property names of the PurchaseOrder event type. The column names provided to the new operator also both match event property names of the Item event type.

When the event type declares the column as a single value (and not an array) and the select-clause expression produces multiple rows, the engine populates only the first row.

Consider a PurchaseOrder event type that has a property called items that holds a single Item event:

create schema PurchaseOrder(orderId string, items Item)

The sample subquery below populates only the very first event, discarding remaining subquery result events, since the items property above is declared as holding a single Item-typed event only (versus Item[] to hold multiple Item-typed events).

insert into PurchaseOrder select 
(select 'i1' as name, 10 as price from HistoryEvent.win:length(2)) as items 
from TriggerEvent

Consider using a subquery with filter, or one of the enumeration methods to select a specific subquery result row.

A subquery is a select within another statement. Esper supports subqueries in the select clause, where clause, having clause and in stream and pattern filter expressions. Subqueries provide an alternative way to perform operations that would otherwise require complex joins. Subqueries can also make statements more readable than complex joins.

Esper supports both simple subqueries as well as correlated subqueries. In a simple subquery, the inner query is not correlated to the outer query. Here is an example simple subquery within a select clause:

select assetId, (select zone from ZoneClosed.std:lastevent()) as lastClosed from RFIDEvent

If the inner query is dependent on the outer query, we will have a correlated subquery. An example of a correlated subquery is shown below. Notice the where clause in the inner query, where the condition involves a stream from the outer query:

select * from RfidEvent as RFID where 'Dock 1' = 
  (select name from Zones.std:unique(zoneId) where zoneId = RFID.zoneId)

The example above shows a subquery in the where clause. The statement selects RFID events for which the zone name, looked up by zone id, matches a string constant. The statement uses the view std:unique to guarantee that only the last event per zone id is held for processing by the subquery.

The next example is a correlated subquery within a select clause. In this statement the select clause retrieves the zone name by means of a subquery against the Zones set of events correlated by zone id:

select zoneId, (select name from Zones.std:unique(zoneId) 
  where zoneId = RFID.zoneId) as name from RFIDEvent as RFID

Note that when a simple or correlated subquery returns multiple rows, the engine returns a null value as the subquery result. To limit the number of events returned by a subquery consider using one of the views std:lastevent, std:unique and std:groupwin or aggregation functions or the multi-row and multi-column selects as described below.
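The single-row rule above can be modeled in a few lines of Python. This is only an illustrative sketch of the described behavior, not engine code: the function name scalar_subquery_result and the list-of-values representation are assumptions made for the example.

```python
def scalar_subquery_result(rows):
    """Model the result of a single-column subquery as described above.

    rows: the values the subquery currently matches (a hypothetical
    representation; the engine works on events, not Python lists).
    Returns the value when exactly one row matches, otherwise None (null).
    """
    if len(rows) == 1:
        return rows[0]
    # No rows, or more than one row: the subquery result is null.
    return None


# A window such as std:unique(zoneId) holds at most one row per zone id,
# so a lookup correlated on zoneId matches zero or one row.
zones_by_id = {1: "Dock 1", 2: "Dock 2"}
matching = [name for zone_id, name in zones_by_id.items() if zone_id == 1]

print(scalar_subquery_result(matching))              # one row: its value
print(scalar_subquery_result(["Dock 1", "Dock 2"]))  # two rows: None
```

The sketch shows why a data window that retains one row per correlation key is a natural fit for subqueries that are expected to return a single value.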

The select clause of a subquery also allows wildcard selects, which return as an event property the underlying event object of the event type as defined in the from clause. An example:

select (select * from MarketData.std:lastevent()) as md 
  from pattern [every timer:interval(10 sec)]

The output events to the statement above contain the underlying MarketData event in a property named "md". The statement populates the last MarketData event into a property named "md" every 10 seconds following the pattern definition, or populates a null value if no MarketData event has been encountered so far.

Aggregation functions may be used in the select clause of the subselect as this example outlines:

select * from MarketData
where price > (select max(price) from MarketData(symbol='GOOG').std:lastevent())

As the sub-select expression is evaluated first (by default), the query above actually never fires for the GOOG symbol, only for other symbols that have a price higher than the current maximum for GOOG. As a side note, the insert into clause can also be handy to compute aggregation results for use in multiple subqueries.

When using aggregation functions in a correlated subselect, the engine computes the aggregation based on the data window (if provided), named window or table contents matching the where-clause.

The following example compares the quantity value provided by the current order event against the total quantity of all order events in the last 1 hour for the same client.

select * from OrderEvent oe
where qty > 
  (select sum(qty) from OrderEvent.win:time(1 hour) pd 
  where pd.client = oe.client)

Filter expressions in a pattern or stream may also employ subqueries. Subqueries can be uncorrelated or can be correlated to properties of the stream or to properties of tagged events in a pattern. Subqueries may reference named windows and tables as well.

The following example filters BarData events that have a close price less than the last moving average (field movAgv) as provided by stream SMA20Stream (an uncorrelated subquery):

select * from BarData(ticker='MSFT', closePrice < 
    (select movAgv from SMA20Stream(ticker='MSFT').std:lastevent()))

A few generic examples follow to demonstrate the point. The examples use short event and property names so they are easy to read. Assume that A and B are streams, DNamedWindow is a named window and ETable is a table, and that they provide the properties a_id and a_val, b_id and b_val, d_id and d_val, and e_id and e_val respectively:

// Sample correlated subquery as part of stream filter criteria
select * from A(a_val in 
  (select b_val from B.std:unique(b_val) as b where a.a_id = b.b_id)) as a
// Sample correlated subquery against a named window
select * from A(a_val in 
  (select d_val from DNamedWindow as d where a.a_id = d.d_id)) as a
// Sample correlated subquery in the filter criteria as part of a pattern, querying a named window
select * from pattern [
  a=A -> b=B(b_val = 
    (select d_val from DNamedWindow as d where d.d_id = b.b_id and d.d_id = a.a_id))
]
// Sample correlated subquery against a table
select * from A(a_val in 
  (select e_val from ETable as e where a.a_id = e.e_id)) as a

Subquery state starts to accumulate as soon as a statement starts (and not only when a pattern-subexpression activates).

The following restrictions apply to subqueries:

  1. Subqueries can only consist of a select clause, a from clause, a where clause and a group by clause. The having clause, as well as joins, outer-joins and output rate limiting are not permitted within subqueries.

  2. If using aggregation functions in a subquery, note these limitations:

    1. None of the properties of the correlated stream(s) can be used within aggregation functions.

    2. The properties of the subselect stream must all be within aggregation functions.

  3. With the exception of subqueries against named windows and tables and subqueries that are both uncorrelated and fully-aggregated, the subquery stream definition must define a data window to limit subquery results, for the purpose of identifying the events held for subquery execution.

The order of evaluation of subqueries relative to the containing statement is guaranteed: If the containing statement and its subqueries are reacting to the same type of event, the subquery will receive the event first before the containing statement's clauses are evaluated. This behavior can be changed via configuration. The order of evaluation of subqueries is not guaranteed between subqueries.

Performance of your statement containing one or more subqueries principally depends on two parameters. First, if your subquery correlates one or more columns in the subquery stream with the enclosing statement's streams, the engine automatically builds the appropriate indexes for fast row retrieval based on the key values correlated (joined). The second parameter is the number of rows found in the subquery stream and the complexity of the filter criteria (where clause), as each row in the subquery stream must evaluate against the where clause filter.
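The difference between the two parameters can be illustrated with a small Python sketch. It is a simplified model only: the engine's actual index structures are not described in this section, and the dict-based index, the function names and the Zones sample rows are assumptions made for the example.

```python
from collections import defaultdict


def full_scan(rows, key_field, key_value):
    """Evaluate the correlation criteria against every row (no index)."""
    return [row for row in rows if row[key_field] == key_value]


def build_index(rows, key_field):
    """Build a hash index from correlated key value to the rows holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[key_field]].append(row)
    return index


# Hypothetical subquery stream contents: 1000 zone rows.
zones = [{"zoneId": i, "name": "Zone %d" % i} for i in range(1000)]
index = build_index(zones, "zoneId")

# Both strategies return the same rows. The index lookup goes straight to
# the matching rows, while the scan evaluates the criteria once per row.
assert full_scan(zones, "zoneId", 42) == index.get(42, [])
print(index.get(42, []))
```

In this model the cost of the scan grows with the number of rows held for the subquery, which corresponds to the second parameter named above.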

The any subquery condition is true if the expression returns true for one or more of the values returned by the subquery.

The synopsis for the any keyword is as follows:

expression operator any (subquery)
expression operator some (subquery)

The left-hand expression is evaluated and compared to each row of the subquery result using the given operator, which must yield a Boolean result. The result of any is "true" if any true result is obtained. The result is "false" if no true result is found (including the special case where the subquery returns no rows).

The operator can be any of the following values: =, !=, <>, <, <=, >, >=.

The some keyword is a synonym for any. The in construct is equivalent to = any.

The right-hand side subquery must return exactly one column.

The next statement demonstrates the use of the any subquery condition:

select * from ProductOrder as ord
  where quantity < any
    (select minimumQuantity from MinimumQuantity.win:keepall())

The above query compares each ProductOrder event's quantity value with all rows from the MinimumQuantity stream of events and returns only those ProductOrder events that have a quantity that is less than any of the minimum quantity values of the MinimumQuantity events.

Note that if there are no successes and at least one right-hand row yields null for the operator's result, the result of the any construct will be null, not false. This is in accordance with SQL's normal rules for Boolean combinations of null values.
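The three possible outcomes of the any condition (true, false and null) can be sketched in Python as follows. This is a model of the rules stated above, not the engine's implementation: the function name any_condition is hypothetical and None stands in for null.

```python
import operator

# The comparison operators permitted with the any keyword.
OPERATORS = {
    "=": operator.eq, "!=": operator.ne, "<>": operator.ne,
    "<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge,
}


def any_condition(left, op, subquery_values):
    """Return True, False or None (null) for: left op any (subquery)."""
    compare = OPERATORS[op]
    saw_null = False
    for value in subquery_values:
        if left is None or value is None:
            # A comparison involving null yields null for this row.
            saw_null = True
        elif compare(left, value):
            # One true comparison is enough.
            return True
    # No success: null if any row compared to null, otherwise false.
    # An empty subquery result falls through to false.
    return None if saw_null else False


print(any_condition(5, "<", [3, 10]))    # True
print(any_condition(5, "<", [1, 2]))     # False
print(any_condition(5, "<", [1, None]))  # None
print(any_condition(5, "<", []))         # False
```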

Your subquery may select multiple columns in the select clause including multiple aggregated values from a data window or named window or table.

The following example is a correlated subquery that selects wildcard and in addition selects the bid and offer properties of the last MarketData event for the same symbol as the arriving OrderEvent:

select *,
  (select bid, offer from MarketData.std:unique(symbol) as md 
   where md.symbol = oe.symbol) as bidoffer
from OrderEvent oe

Output events for the above query contain all properties of the original OrderEvent event. In addition each output event contains a bidoffer nested property that itself contains the bid and offer properties. You may retrieve the bid and offer from output events directly via the bidoffer.bid property name syntax for nested properties.

The next example is similar to the above query but instead selects aggregations and selects from a named window by name OrderNamedWindow (creation not shown here). For each arriving OrderEvent it selects the total quantity and count of all order events for the same client, as currently held by the named window:

select *,
  (select sum(qty) as sumPrice, count(*) as countRows 
   from OrderNamedWindow as onw
   where onw.client = oe.client) as pastOrderTotals
from OrderEvent as oe

The next EPL statement computes a prorated quantity considering the maximum and minimum quantity for the last 1 minute of order events:

expression subq {
  (select max(quantity) as maxq, min(quantity) as minq from OrderEvent.win:time(1 min))
}
select (quantity - subq().minq) / (subq().maxq - subq().minq) as prorated
from OrderEvent
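As a rough illustration of the prorated computation, the Python sketch below keeps the last minute of quantities and computes (quantity - min) / (max - min) for each arriving order. It is a model under stated assumptions: the class name, the numeric timestamps in seconds, the exact expiry boundary and the handling of the case where the maximum equals the minimum are all choices made for this example and are not taken from the engine.

```python
from collections import deque


class ProratedQuantity:
    """Model of a prorated quantity over a sliding time window."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, quantity), oldest first

    def on_order(self, timestamp, quantity):
        # By default the subquery receives the event before the containing
        # statement is evaluated, so the arriving order is added first.
        self.events.append((timestamp, quantity))
        # Assumption: an event leaves the window once it is window_seconds old.
        while self.events[0][0] <= timestamp - self.window_seconds:
            self.events.popleft()
        quantities = [q for _, q in self.events]
        maxq, minq = max(quantities), min(quantities)
        if maxq == minq:
            # Assumption: report no value rather than divide by zero.
            return None
        return (quantity - minq) / (maxq - minq)


p = ProratedQuantity()
print(p.on_order(0, 10))   # None: only one quantity in the window
print(p.on_order(10, 30))  # 1.0: the new order is the maximum
print(p.on_order(20, 20))  # 0.5: halfway between 10 and 30
```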

Output events for the earlier query that selects from OrderNamedWindow contain all properties of the original OrderEvent event. In addition each output event contains a pastOrderTotals nested property that itself contains the sumPrice and countRows properties.

While a subquery cannot change the cardinality of the selected stream, a subquery can return multiple values from the selected data window or named window or table. This section shows examples of the window aggregation function as well as the use of enumeration methods with subselects.

Consider using an inner join, outer join or unidirectional join instead to achieve a 1-to-many cardinality in the number of output events.

The next example is an uncorrelated subquery that selects all current ZoneEvent events considering the last ZoneEvent per zone for each arriving RFIDEvent.

select assetId,
 (select window(z.*) as winzones from ZoneEvent.std:unique(zone) as z) as zones
 from RFIDEvent

Output events for the above query contain two properties: the assetId property and the zones property. The latter property is a nested property that contains the winzones property. You may retrieve the zones from output events directly via the zones.winzones property name syntax for nested properties.

In this example for a correlated subquery against a named window we assume that the OrderNamedWindow has been created and contains order events. The query returns for each MarketData event the list of order ids for orders with the same symbol:

select price,
 (select window(orderId) as winorders 
  from OrderNamedWindow onw 
  where onw.symbol = md.symbol) as orderIds
 from MarketData md

Output events for the above query contain two properties: the price property and the orderIds property. The latter property is a nested property that contains the winorders property of type array.

Another option to reduce selected rows to a single value is through the use of enumeration methods.

select price,
 (select *  from OrderNamedWindow onw
  where onw.symbol = md.symbol).selectFrom(v => v) as ordersSymbol
 from MarketData md

Output events for the above query also contain a Collection of underlying events in the ordersSymbol property.

The following hints are available to tune performance and memory use of subqueries.

Use the @Hint('set_noindex') hint for a statement that utilizes one or more subqueries. It instructs the engine to always perform a full scan. The engine does not build an implicit index or use an explicitly-created index when this hint is provided. Use of the hint may result in reduced memory use but poor statement performance.

The following hints are available to tune performance and memory use of subqueries that select from named windows (does not apply to tables).

Named windows are globally-visible data windows. As such an application may create explicit indexes as discussed in Section 6.9, “Explicitly Indexing Named Windows and Tables”. The engine may also elect to create implicit indexes (no create-index EPL required) for index-based lookup of rows when executing on-select, on-merge, on-update and on-delete statements and for statements that subquery a named window.

By default and without specifying a hint, each statement that subqueries a named window also maintains its own index for looking up events held by the named window. The engine maintains the index by consuming the named window insert and remove stream. When the statement is destroyed it releases that index.

Specify the @Hint('enable_window_subquery_indexshare') hint to enable subquery index sharing for named windows. When using this hint, indexes for subqueries are maintained by the named window itself (and not each statement context partition), are shared between one or more statements and may also utilize explicit indexes. Specify the hint once as part of the create window statement.

This sample EPL statement creates a named window with subquery index sharing enabled:

@Hint('enable_window_subquery_indexshare')
create window OrdersNamedWindow.win:keepall() as OrderMapEventType

When subquery index sharing is enabled, performance may increase as named window stream consumption is no longer needed for correlated subqueries. You may also expect reduced memory use especially if a large number of EPL statements perform similar subqueries against a named window. Subquery index sharing may require additional short-lived object creation and may slightly increase lock held time for named windows.

The following statement performs a correlated subquery against the named window above. When a settlement event arrives it selects the order detail for the same order id as provided by the settlement event:

select 
  (select * from OrdersNamedWindow as onw 
    where onw.orderId = se.orderId) as orderDetail
  from SettlementEvent as se

With subquery index sharing enabled the engine maintains an index of order events by order id for the named window, and shares that index between additional statements until the time all utilizing statements are destroyed.

You may disable subquery index sharing for a specific statement by specifying the @Hint('disable_window_subquery_indexshare') hint, as this example shows, causing the statement to maintain its own index:

@Hint('disable_window_subquery_indexshare')
select 
  (select * from OrdersNamedWindow as onw 
    where onw.orderId = se.orderId) as orderDetail
  from SettlementEvent as se

Two or more event streams can be part of the from-clause and thus both (all) streams determine the resulting events. This section summarizes the important concepts. The sections that follow present more detail on each topic.

The default join is an inner join which produces output events only when there is at least one match in all streams.

Consider the sample statement shown next:

select * from TickEvent.std:lastevent(), NewsEvent.std:lastevent()

The above statement outputs the last TickEvent and the last NewsEvent in one output event when either a TickEvent or a NewsEvent arrives. If no TickEvent was received before a NewsEvent arrives, no output occurs. Similarly when no NewsEvent was received before a TickEvent arrives, no output occurs.

The where-clause lists the join conditions that Esper uses to relate events in the two or more streams.

The next example statement retains the last TickEvent and last NewsEvent per symbol, and joins the two streams based on their symbol value:

select * from TickEvent.std:unique(symbol) as t, NewsEvent.std:unique(symbol) as n
where t.symbol = n.symbol

As before, when a TickEvent arrives for a symbol that has no matching NewsEvent then there is no output event.
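The behavior of this join can be modeled with two dictionaries that each keep the last event per symbol. The sketch below is a simplified Python model of the insert stream only. The class and method names are hypothetical and events are represented as plain dicts.

```python
class UniqueWindowJoin:
    """Keep the last tick and last news item per symbol and join them."""

    def __init__(self):
        self.last_tick = {}  # symbol -> last TickEvent (like std:unique(symbol))
        self.last_news = {}  # symbol -> last NewsEvent (like std:unique(symbol))

    def on_tick(self, tick):
        # The arriving tick replaces the previous tick for its symbol.
        self.last_tick[tick["symbol"]] = tick
        news = self.last_news.get(tick["symbol"])
        # Inner join: output only when the other stream has a match.
        return [(tick, news)] if news is not None else []

    def on_news(self, news):
        self.last_news[news["symbol"]] = news
        tick = self.last_tick.get(news["symbol"])
        return [(tick, news)] if tick is not None else []


join = UniqueWindowJoin()
print(join.on_tick({"symbol": "IBM", "price": 100}))    # []: no news for IBM yet
print(join.on_news({"symbol": "IBM", "text": "Q1"}))    # one joined row
print(join.on_tick({"symbol": "MSFT", "price": 30}))    # []: no news for MSFT
```

Either stream can trigger output in this model, which matches the default join behavior described in this section.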

An outer join does not require each event in either stream to have a matching event. The full outer join is useful when output is desired when no match is found. The different outer join types (full, left, right) are explained in more detail below.

This example statement is an outer-join and also returns the last TickEvent and last NewsEvent per symbol:

select * from TickEvent.std:unique(symbol) as t
full outer join NewsEvent.std:unique(symbol) as n on t.symbol = n.symbol

In the sample statement above, when a TickEvent arrives for a symbol that has no matching NewsEvent, or when a NewsEvent arrives for a symbol that has no matching TickEvent, the statement still produces an output event with a null column value for the missing event.

Note that each of the sample queries above defines a data window. The sample queries above use the last-event data window (std:lastevent) or the unique data window (std:unique). A data window serves to indicate the subset of events to join from each stream and may be required depending on the join.

In the above queries, the query evaluates and there is output when either a TickEvent or a NewsEvent arrives. The same holds true if additional streams are added to the from-clause: each of the streams in the from-clause triggers the join to evaluate.

The unidirectional keyword instructs the engine to evaluate the join only when an event arrives from the single stream that was marked with the unidirectional keyword. In this case no data window should be specified for the stream marked as unidirectional since the keyword implies that the current event of that stream triggers the join.

Here is the sample statement above with unidirectional keyword, so that output occurs only when a TickEvent arrives and not when a NewsEvent arrives:

select * from TickEvent as t unidirectional, NewsEvent.std:unique(symbol) as n 
where t.symbol = n.symbol

It is oftentimes the case that an aggregation (count, sum, average) only needs to be calculated in the context of an arriving event or timer. Consider using the unidirectional keyword when aggregating over joined streams.

An EPL pattern can take part in a join like any other stream: it provides a stream of data consisting of pattern matches. A time pattern, for example, can be useful to evaluate a join and produce output upon each interval.

This sample statement includes a pattern that fires every 5 seconds and thus triggers the join to evaluate and produce output, computing an aggregated total quantity per symbol every 5 seconds:

select t.symbol, sum(qty) from pattern[every timer:interval(5 sec)] unidirectional, 
  TickEvent.std:unique(symbol) as t, NewsEvent.std:unique(symbol) as n 
where t.symbol = n.symbol group by t.symbol

Named windows as well as reference and historical data such as stored in your relational database, and data returned by a method invocation, can also be included in joins as discussed in Section 5.13, “Accessing Relational Data via SQL” and Section 5.14, “Accessing Non-Relational Data via Method Invocation”.

Related to joins are subqueries: A subquery is a select within another statement, see Section 5.11, “Subqueries”.

The engine performs extensive query analysis and planning, building internal indexes and strategies as required to allow fast evaluation of many types of queries.

Each time an event arrives on one of the event streams, the event streams are joined and output events are produced according to the where clause when matching events are found for all joined streams.

This example joins 2 event streams. The first event stream consists of fraud warning events for which we keep the last 30 minutes. The second stream is withdrawal events for which we consider the last 30 seconds. The streams are joined on account number.

select fraud.accountNumber as accntNum, fraud.warning as warn, withdraw.amount as amount,
       max(fraud.timestamp, withdraw.timestamp) as timestamp, 'withdrawlFraud' as desc
  from com.espertech.esper.example.atm.FraudWarningEvent.win:time(30 min) as fraud,
       com.espertech.esper.example.atm.WithdrawalEvent.win:time(30 sec) as withdraw
 where fraud.accountNumber = withdraw.accountNumber

Joins can also include one or more pattern statements as the next example shows:

select * from FraudWarningEvent.win:time(30 min) as fraud,
    pattern [every w=WithdrawalEvent -> PINChangeEvent(acct=w.acct)].std:lastevent() as withdraw
 where fraud.accountNumber = withdraw.w.accountNumber

The statement above joins the last 30 minutes of fraud warnings with a pattern. The pattern consists of every withdrawal event that is followed by a PIN change event for the same account number. It joins the two event streams on account number. The last-event view instructs the join to only consider the last pattern match.

In a join and outer join, your statement must declare a data window view or other view onto each stream. Streams that are marked as unidirectional, named windows and tables, as well as database and method-invocation sources in a join are an exception and do not require a view to be specified. If you are joining an event to itself via contained-event selection, views also do not need to be specified. The reason that a data window must be declared is that a data window specifies which events are considered for the join (i.e. last event, last 10 events, all events, last 1 second of events etc.).

The next example joins all FraudWarningEvent events that arrived since the statement was started, with the last 20 seconds of PINChangeEvent events:

select * from FraudWarningEvent.win:keepall() as fraud, PINChangeEvent.win:time(20 sec) as pin
 where fraud.accountNumber = pin.accountNumber

The above example employed the special keep-all view that retains all events.

Esper supports left outer joins, right outer joins, full outer joins and inner joins in any combination between an unlimited number of event streams. Outer and inner joins can also join reference and historical data as explained in Section 5.13, “Accessing Relational Data via SQL”, as well as join data returned by a method invocation as outlined in Section 5.14, “Accessing Non-Relational Data via Method Invocation”.

The keywords left, right, full and inner control the type of the join between two streams. The optional on clause specifies one or more properties that join each stream. The synopsis is as follows:

...from stream_def [as name] 
  ((left|right|full outer) | inner) join stream_def 
  [on property = property [and property = property ...] ]
  [ ((left|right|full outer) | inner) join stream_def [on ...]]...

If the outer join is a left outer join, there will be an output event for each event of the stream on the left-hand side of the clause. For example, in the left outer join shown below we will get output for each event in the stream RfidEvent, even if the event does not match any event in the event stream OrderList.

select * from RfidEvent.win:time(30 sec) as rfid
       left outer join
       OrderList.win:length(10000) as orderList
     on rfid.itemId = orderList.itemId

Similarly, if the join is a right outer join, then there will be an output event for each event of the stream on the right-hand side of the clause. For example, in the right outer join shown below we will get output for each event in the stream OrderList, even if the event does not match any event in the event stream RfidEvent.

select * from RfidEvent.win:time(30 sec) as rfid
       right outer join
       OrderList.win:length(10000) as orderList
       on rfid.itemId = orderList.itemId

For all types of outer joins, if the join condition is not met, the select list is computed with the event properties of the arrived event while all other event properties are considered to be null.

The next type of outer join is a full outer join. In a full outer join, each time an event arrives on one of the event streams, one or more output events are produced. In the example below, when either an RfidEvent or an OrderList event arrives, one or more output events are produced. The next example shows a full outer join that joins on multiple properties:

select * from RfidEvent.win:time(30 sec) as rfid
       full outer join
       OrderList.win:length(10000) as orderList
       on rfid.itemId = orderList.itemId and rfid.assetId = orderList.assetId

The last type of join is an inner join. In an inner join, the engine produces an output event for each event of the stream on the left-hand side that matches at least one event on the right hand side considering the join properties. For example, in the inner join shown below we will get output for each event in the RfidEvent stream that matches one or more events in the OrderList data window:

select * from RfidEvent.win:time(30 sec) as rfid
       inner join
       OrderList.win:length(10000) as orderList
       on rfid.itemId = orderList.itemId and rfid.assetId = orderList.assetId
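The four join types described above differ mainly in what happens when an arriving event finds no match in the other stream's data window. The Python sketch below models that difference for a two-stream join on a single property. It is a simplified illustration of the insert stream only: the function name, the dict-based events and the single join key are assumptions made for the example.

```python
def join_on_arrival(event, arrived_on, other_window, key, how):
    """Return the output rows (left, right) for one arriving event.

    arrived_on:   "left" or "right", the side the event arrived on
    other_window: events currently held by the other stream's data window
    how:          "inner", "left", "right" or "full"
    """
    matches = [other for other in other_window if other[key] == event[key]]
    if arrived_on == "left":
        rows = [(event, match) for match in matches]
        preserved = how in ("left", "full")
        unmatched_row = (event, None)
    else:
        rows = [(match, event) for match in matches]
        preserved = how in ("right", "full")
        unmatched_row = (None, event)
    if rows:
        return rows
    # No match: only a stream preserved by the outer join type still
    # produces output, with the other stream's properties being null.
    return [unmatched_row] if preserved else []


rfid = {"itemId": 1}
orders = [{"itemId": 2}]
print(join_on_arrival(rfid, "left", orders, "itemId", "left"))   # [({'itemId': 1}, None)]
print(join_on_arrival(rfid, "left", orders, "itemId", "right"))  # []
print(join_on_arrival(rfid, "left", orders, "itemId", "inner"))  # []
```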

Patterns as streams in a join follow this rule: If no data window view is declared for the pattern then the pattern stream retains the last match. Thus a pattern must have matched at least once for the last row to become available in a join. Multiple rows from a pattern stream may be retained by declaring a data window view onto a pattern using the pattern [...].view_specification syntax.

This example outer joins multiple streams. Here the RfidEvent stream is outer joined to both ProductName and LocationDescription via left outer join:

select * from RfidEvent.win:time(30 sec) as rfid
      left outer join ProductName.win:keepall() as refprod
        on rfid.productId = refprod.prodId
      left outer join LocationDescription.win:keepall() as refdesc
        on rfid.location = refdesc.locId

If the optional on clause is specified, it may only employ the = equals operator and property names. Any other operators must be placed in the where-clause. The stream names that appear in the on clause may refer to any stream in the from-clause.

Your EPL may also provide no on clause. This is useful when the streams that are joined do not provide any properties to join on, for example when joining with a time-based pattern.

The next example employs a unidirectional left outer join such that the engine, every 10 seconds, outputs a count of the number of RfidEvent events in the 60-second time window.

select count(*) from
  pattern[every timer:interval(10 sec)] unidirectional 
  left outer join
  RfidEvent.win:time(60 sec)

In a join or outer join your statement lists multiple event streams, views and/or patterns in the from clause. As events arrive into the engine, each of the streams (views, patterns) provides insert and remove stream events. The engine evaluates each insert and remove stream event provided by each stream, and joins or outer joins each event against data window contents of each stream, and thus generates insert and remove stream join results.

The direction of the join execution depends on which stream or streams are currently providing an insert or remove stream event for executing the join. A join is thus multidirectional, or bidirectional when only two streams are joined. A join can be made unidirectional if your application does not want new results when events arrive on a given stream or streams.

The unidirectional keyword can be used in the from clause to identify a single stream that provides the events to execute the join. If the keyword is present for a stream, all other streams in the from clause become passive streams. When events arrive or leave a data window of a passive stream then the join does not generate join results.

For example, consider a use case that requires us to join stock tick events (TickEvent) and news events (NewsEvent). The unidirectional keyword allows the statement to generate results only when TickEvent events arrive, and not when NewsEvent events arrive or leave the 10-second time window:

select * from TickEvent as tick unidirectional, NewsEvent.win:time(10 sec) as news 
where tick.symbol = news.symbol

Aggregation functions in a unidirectional join aggregate within the context of each unidirectional event evaluation and are not cumulative. Thereby aggregation functions when used with unidirectional may evaluate faster as they do not need to consider a remove stream (data removed from data windows or named windows).

The count function in the next query returns, for each TickEvent, the number of matching NewsEvent events:

select count(*) from TickEvent as tick unidirectional, NewsEvent.win:time(10 sec) as news 
where tick.symbol = news.symbol
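The following Python sketch models the count query above: news events only update the time window, and each tick triggers a fresh count of matching news. It is an illustration under stated assumptions and not engine code. The class name, the numeric timestamps in seconds, the expiry boundary and the choice to report no output when there is no match are all assumptions made for the example.

```python
from collections import deque


class UnidirectionalCount:
    """Model of a unidirectional join of ticks against a news time window."""

    def __init__(self, window_seconds=10):
        self.window_seconds = window_seconds
        self.news = deque()  # (timestamp, symbol), oldest first

    def _expire(self, now):
        # Assumption: a news event leaves the window once it is window_seconds old.
        while self.news and self.news[0][0] <= now - self.window_seconds:
            self.news.popleft()

    def on_news(self, timestamp, symbol):
        # Passive stream: the window is updated but no join result is produced.
        self._expire(timestamp)
        self.news.append((timestamp, symbol))
        return None

    def on_tick(self, timestamp, symbol):
        # Unidirectional stream: each tick triggers the join. The count is
        # computed for this tick only and is not carried over between ticks.
        self._expire(timestamp)
        count = sum(1 for _, s in self.news if s == symbol)
        # Assumption: with no matching news the inner join has no rows,
        # so this model reports no output.
        return count if count > 0 else None


u = UnidirectionalCount()
u.on_news(0, "IBM")
u.on_news(1, "IBM")
u.on_news(2, "MSFT")
print(u.on_tick(3, "IBM"))   # 2
print(u.on_tick(4, "IBM"))   # 2 again: counts do not accumulate across ticks
print(u.on_tick(10, "IBM"))  # 1: the news event from time 0 has expired
```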

The following restrictions apply to unidirectional joins:

  1. The unidirectional keyword can only be specified for a single stream in the from clause.

  2. Receiving data from a unidirectional join via the pull API (iterator method) is not allowed. This is because the engine holds no state for the single stream that provides the events to execute the join.

  3. The stream that declares the unidirectional keyword cannot declare a data window view or other view for that stream, since remove stream events are not processed for the single stream.

When joining 3 or more streams (including any relational or non-relational sources as below) it can sometimes help to provide the query planner with instructions on how best to execute the join. The engine compiles a query plan for the EPL statement at statement creation time. You can output the query plan to logging (see configuration).

An outer join that specifies only inner keywords for all streams is equivalent to a default (inner) join. The following two statements are equivalent:

select * from TickEvent.std:lastevent() as tick, 
    NewsEvent.std:lastevent() as news where tick.symbol = news.symbol

Equivalent to:

select * from TickEvent.std:lastevent() as tick 
	inner join NewsEvent.std:lastevent() as news on tick.symbol = news.symbol

For all types of joins, the query planner determines a query graph: The term is used here for all the information regarding what properties or expressions are used to join the streams. The query graph thus includes the where-clause expressions as well as outer-join on-clauses if this statement is an outer join. The query planner also computes a dependency graph which includes information about all historical data streams (relational and non-relational as below) and their input needs.

For default (inner) joins the query planner first attempts to find a path of execution as a nested iteration. For each stream the query planner selects the best order of streams available for the nested iteration considering the query graph and dependency graph. If the full depth of the join is achievable via nested iteration for all streams without full table scan then the query planner uses that nested iteration plan. If not, then the query planner re-plans considering a merge join (Cartesian) approach instead.

Specify the @Hint('prefer_merge_join') to instruct the query planner to prefer a merge join plan instead of a nested iteration plan. Specify the @Hint('force_nested_iter') to instruct the query planner to always use a nested iteration plan.

For example, consider the below statement. Depending on the number of matching rows in OrderBookOne and OrderBookTwo (named windows in this example, and assumed to be defined elsewhere) the performance of the join may be better using the merge join plan.

@Hint('prefer_merge_join') 
select * from TickEvent.std:lastevent() t, 
	OrderBookOne ob1, OrderBookTwo ob2
where ob1.symbol = t.symbol and ob2.symbol = t.symbol 
and ob1.price between t.buy and t.sell and ob2.price between t.buy and t.sell

For outer joins the query planner considers nested iteration and merge join (Cartesian) equally, and the above hints do not apply.

This chapter outlines how reference data and historical data that are stored in a relational database can be queried via SQL within EPL statements.

Esper can join and outer join all types of event streams to stored data, and can also query stored data via the iterator (poll) API. For such data sources to become accessible to Esper, some configuration is required. Section 16.4.9, “Relational Database Access” explains the required configuration for database access in greater detail, and includes information on configuring a query result cache.

Esper does not parse or otherwise inspect your SQL query. Therefore your SQL can make use of any database-specific SQL language extensions or features that your database provides.

If you have enabled query result caching in your Esper database configuration, Esper retains SQL query results in cache following the configured cache eviction policy.

Also if you have enabled query result caching in your Esper database configuration and provide EPL where clause and/or on clause (outer join) expressions, then Esper builds indexes on the SQL query results to enable fast lookup. This is especially useful if your queries return a large number of rows. For building the proper indexes, Esper inspects the expression found in your EPL query where clause, if present. For outer joins, Esper also inspects your EPL query on clause. Esper analyzes the EPL on clause and where clause expressions, if present, looking for property comparison with or without logical AND-relationships between properties. When a SQL query returns rows for caching, Esper builds and caches the appropriate index and lookup strategies for fast row matching against indexes.

Joins or outer joins in which only SQL statements or method invocations are listed in the from clause and no other event streams are termed passive joins. A passive join does not produce an insert or remove stream and therefore does not invoke statement listeners with results. A passive join can be iterated on (polled) using a statement's safeIterator and iterator methods.

There are no restrictions to the number of SQL statements or types of streams joined. The following restrictions currently apply:

  • Sub-views on an SQL query are not allowed; that is, one cannot create a time or length window on an SQL query. However, one can use the insert into syntax to make join results available to a further statement.

  • Your database software must support JDBC prepared statements that provide statement meta data at compilation time. Most major databases provide this function. A workaround is available for databases that do not provide this function.

  • JDBC drivers must support the getMetaData feature. A workaround is available, as described below, for JDBC drivers that don't support getting metadata.

The next sections assume basic knowledge of SQL (Structured Query Language).

To join an event stream against stored data, specify the sql keyword followed by the name of the database and a parameterized SQL query. The syntax to use in the from clause of an EPL statement is:

sql:database_name [" parameterized_sql_query "]

The engine uses the database_name identifier to obtain configuration information in order to establish a database connection, as well as settings that control connection creation and removal. Please see Section 16.4.9, “Relational Database Access” to configure an engine for database access.

Following the database name is the SQL query to execute. The SQL query can contain one or more substitution parameters. The SQL query string is enclosed in square brackets [ and ]. Within the brackets, the SQL query can be placed in either single quotes (') or double quotes ("). The SQL query text is passed to your database software unchanged, allowing you to write any SQL query syntax that your database understands, including stored procedure calls.

Substitution parameters in the SQL query string take the form ${expression}. The engine resolves expression at statement execution time to the actual expression result by evaluating the events in the joined event stream or current variable values, if any event property references or variables occur in the expression. An expression may not contain EPL substitution parameters.

The engine determines the type of the SQL query output columns by means of the result set metadata that your database software returns for the statement. The actual query results are obtained via the getObject method of java.sql.ResultSet.

The sample EPL statement below joins an event stream consisting of CustomerCallEvent events with the results of an SQL query against the database named MyCustomerDB and table Customer:

select custId, cust_name from CustomerCallEvent,
  sql:MyCustomerDB [' select cust_name from Customer where cust_id = ${custId} ']

The example above assumes that CustomerCallEvent supplies an event property named custId. The SQL query selects the customer name from the Customer table. The where clause in the SQL matches the Customer table column cust_id with the value of custId in each CustomerCallEvent event. The engine executes the SQL query for each new CustomerCallEvent encountered.

If the SQL query returns no rows for a given customer id, the engine generates no output event. Else the engine generates one output event for each row returned by the SQL query. An outer join as described in the next section can be used to control whether the engine should generate output events even when the SQL query returns no rows.
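The substitution syntax can be pictured with a small plain-Java sketch. It is an assumption for illustration, not the engine's actual code: it supposes that each ${expression} is turned into a JDBC-style ? placeholder and that the expressions are collected so their values can be bound for every event. The class and method names are hypothetical, and nested braces inside an expression are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; Esper's internal handling of ${...} may differ.
class SqlParameterSketch {
    // Matches ${...} and captures the text between the braces
    private static final Pattern PARAM = Pattern.compile("\\$\\{([^}]*)\\}");

    // Returns the expressions in the order in which they occur in the SQL text
    static List<String> expressions(String sql) {
        List<String> result = new ArrayList<String>();
        Matcher matcher = PARAM.matcher(sql);
        while (matcher.find()) {
            result.add(matcher.group(1).trim());
        }
        return result;
    }

    // Replaces every ${...} with a prepared-statement placeholder
    static String toPreparedSql(String sql) {
        return PARAM.matcher(sql).replaceAll("?");
    }
}
```

Applied to the query above, the sketch yields the text select cust_name from Customer where cust_id = ? together with the single expression custId, which would be evaluated against each arriving CustomerCallEvent.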

The next example adds a time window of 30 seconds to the event stream CustomerCallEvent. It also renames the selected properties to customerName and customerId to demonstrate how the naming of columns in an SQL query can be used in the select clause in the EPL query. And the example uses explicit stream names via the as keyword.

select customerId, customerName from
  CustomerCallEvent.win:time(30 sec) as cce,
  sql:MyCustomerDB ["select cust_id as customerId, cust_name as customerName from Customer 
                  where cust_id = ${cce.custId}"] as cq

Any window, such as the time window, generates insert stream (istream) events as events enter the window, and remove stream (rstream) events as events leave the window. The engine executes the given SQL query for each CustomerCallEvent in both the insert stream and the remove stream. As a performance optimization, the istream or rstream keywords in the select clause can be used to instruct the engine to only join insert stream or remove stream events, reducing the number of SQL query executions.

Since any expression may be placed within the ${...} syntax, you may use variables or user-defined functions as well.

The next example assumes that a variable by name varLowerLimit is defined and that a user-defined function getLimit exists on the MyLib imported class that takes a LimitEvent as a parameter:

select * from LimitEvent le, 
  sql:MyCustomerDB [' select cust_name from Customer where 
      amount > ${max(varLowerLimit, MyLib.getLimit(le))} ']

The example above takes the higher of the current variable value or the value returned by the user-defined function to return only those customer names where the amount exceeds the computed limit.

Consider using the EPL where clause to join the SQL query result to your event stream. Similar to EPL joins and outer-joins that join event streams or patterns, the EPL where clause provides join criteria between the SQL query results and the event stream (as a side note, an SQL where clause is a filter of rows executed by your database on your database server before returning SQL query results).

Esper analyzes the expression in the EPL where clause and outer-join on clause, if present, and builds the appropriate indexes from that information at runtime, to ensure fast matching of event stream events to SQL query results, even if your SQL query returns a large number of rows. Your application must configure a cache for your database using Esper configuration, as such indexes are held with regular data in a cache. If your application does not enable caching of SQL query results, the engine does not build indexes.

The sample EPL statement below joins an event stream consisting of OrderEvent events with the results of an SQL query against the database named MyRefDB and table SymbolReference:

select orders.symbol, symbolDesc from OrderEvent as orders,
  sql:MyRefDB ['select symbol, symbolDesc from SymbolReference'] as reference
  where reference.symbol = orders.symbol

Notice how the EPL where clause joins the OrderEvent stream to the SymbolReference table. In this example, the SQL query itself does not have a SQL where clause and therefore returns all rows from table SymbolReference.

If your application enables caching, the SQL query fires only at the arrival of the first OrderEvent event. When the second OrderEvent arrives, the join execution uses the cached query result. If the caching policy that you specified in the Esper database configuration evicts the SQL query result from cache, then the engine fires the SQL query again to obtain a new result and places the result in cache.

If SQL result caching is enabled and your EPL where clause, as shown in the above example, provides the properties to join, then the engine indexes the SQL query results in cache and retains the index together with the query result in cache. Thus your application can benefit from high performance index-based lookups as long as the SQL query results are found in cache.

The SQL result caches operate on the level of all result rows for a given parameter set. For example, if your query returns 10 rows for a certain set of parameter values then the cache treats all 10 rows as a single entry keyed by the parameter values, and the expiry policy applies to all 10 rows and not to each individual row.
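The granularity described above can be sketched in plain Java as a cache whose key is the list of parameter values and whose value is the complete row list. This is a hypothetical illustration and not Esper's cache implementation: the class name, the LRU eviction policy and the queryCount field are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a result cache keyed by parameter set (LRU eviction assumed).
class ResultCacheSketch {
    private final LinkedHashMap<List<Object>, List<Map<String, Object>>> cache;
    int queryCount = 0; // number of times the underlying query was executed

    ResultCacheSketch(final int maxEntries) {
        // accessOrder=true makes the map iterate from least to most recently used
        this.cache = new LinkedHashMap<List<Object>, List<Map<String, Object>>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(
                    Map.Entry<List<Object>, List<Map<String, Object>>> eldest) {
                return size() > maxEntries; // evicts all rows of one parameter set at once
            }
        };
    }

    // Returns all rows for the parameter set, running the query only on a cache miss
    List<Map<String, Object>> lookup(List<Object> params,
            Function<List<Object>, List<Map<String, Object>>> query) {
        List<Map<String, Object>> rows = cache.get(params);
        if (rows == null) {
            queryCount++;
            rows = query.apply(params);
            cache.put(params, rows);
        }
        return rows;
    }
}
```

Because the row list is stored as one value, eviction always removes every row that belongs to a parameter set; individual rows never expire on their own.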

It is also possible to join multiple autonomous database systems in a single query, for example:

select orders.symbol, symbolDesc from OrderEvent as orders,
  sql:My_Oracle_DB ['select symbol, symbolDesc from SymbolReference'] as reference,
  sql:My_MySQL_DB ['select symbol, orderList from orderHistory'] as history
  where reference.symbol = orders.symbol
  and history.symbol = orders.symbol

Certain JDBC database drivers are known to not return metadata for precompiled prepared SQL statements. This can be a problem as metadata is required by Esper. Esper obtains SQL result set metadata to validate an EPL statement and to provide column types for output events. JDBC drivers that do not provide metadata for precompiled SQL statements require a workaround. Such drivers generally do provide metadata for executed SQL statements, but not for precompiled SQL statements.

Please consult Chapter 16, Configuration for the configuration options available in relation to metadata retrieval.

To obtain metadata for an SQL statement, Esper can alternatively fire an SQL statement which returns the same column names and types as the actual SQL statement but without returning any rows. This kind of SQL statement is referred to as a sample statement in the workaround description below. The engine can then use the sample SQL statement to retrieve metadata for the column names and types returned by the actual SQL statement.

Applications can provide a sample SQL statement to retrieve metadata via the metadatasql keyword:

sql:database_name ["parameterized_sql_query" metadatasql "sql_meta_query"] 

The sql_meta_query must be an SQL statement that returns the same number of columns, the same type of columns and the same column names as the parameterized_sql_query, and does not return any rows.

Alternatively, applications can choose not to provide an explicit sample SQL statement. If the EPL statement does not use the metadatasql syntax, the engine applies lexical analysis to the SQL statement. From the lexical analysis Esper generates a sample SQL statement adding a restrictive clause "where 1=0" to the SQL statement.

Alternatively, you can add the following tag to the SQL statement: ${$ESPER-SAMPLE-WHERE}. If the tag exists in the SQL statement, the engine does not perform lexical analysis and simply replaces the tag with the SQL where clause "where 1=0". Therefore this workaround is applicable to SQL statements that cannot be correctly lexically analyzed. The SQL text after the placeholder is not part of the sample query. For example:

select mycol from sql:myDB [
  'select mycol from mytesttable ${$ESPER-SAMPLE-WHERE} where ....'], ...
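The tag replacement described above can be sketched in a few lines of plain Java. This is a hypothetical illustration of the rule as stated in the text, not the engine's code; the class and method names are assumptions. The sample query consists of the SQL text before the tag followed by where 1=0, and everything after the tag is dropped.

```java
// Hypothetical sketch of deriving a sample (metadata) query from the tag.
class SampleSqlSketch {
    static final String TAG = "${$ESPER-SAMPLE-WHERE}";

    // Builds the sample query: text before the tag plus a clause that matches no rows
    static String sampleSql(String sql) {
        int index = sql.indexOf(TAG);
        if (index < 0) {
            throw new IllegalArgumentException("SQL text does not contain " + TAG);
        }
        return sql.substring(0, index) + "where 1=0";
    }
}
```

For an SQL text such as select mycol from mytesttable ${$ESPER-SAMPLE-WHERE} where mycol > 5, where the trailing condition is made up for the example, the sketch produces select mycol from mytesttable where 1=0.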

If your parameterized_sql_query contains vendor-specific SQL syntax, generation of the metadata query may fail to produce a valid SQL statement. If you experience an SQL error while fetching metadata, use any of the above workarounds with the Oracle JDBC driver.

Your application may need to join data that originates from a web service, a distributed cache, an object-oriented database or simply data held in memory by your application. One way to join in external data is by means of method invocation (or procedure call or function) in the from clause of a statement.

The results of such a method invocation in the from clause play the same role as a relational database table in an inner and outer join in SQL. The role is thus dissimilar to the role of a user-defined function, which may occur in any expression such as in the select clause or the where clause. Both are backed by one or more static methods provided by your class library.

Esper can join and outer join an unlimited number and all types of event streams to the data returned by your method invocation. In addition, Esper can be configured to cache the data returned by your method invocations.

Joins or outer joins in which only SQL statements or method invocations are listed in the from clause and no other event streams are termed passive joins. A passive join does not produce an insert or remove stream and therefore does not invoke statement listeners with results. A passive join can be iterated on (polled) using a statement's safeIterator and iterator methods.

The following restrictions currently apply:

  • Sub-views on a method invocation are not allowed; that is, one cannot create a time or length window on a method invocation. However, one can use the insert into syntax to make join results available to a further statement.

The syntax for a method invocation in the from clause of an EPL statement is:

method:class_or_variable_name.method_name[(parameter_expressions)]

The method keyword denotes a method invocation. It is followed by a class or variable name and a method name separated by a dot (.) character. If you have parameters to your method invocation, these are placed in parentheses after the method name. Any expression is allowed as a parameter, and individual parameter expressions are separated by a comma. Expressions may also use event properties of the joined stream.

In the sample join statement shown next, the method lookupAsset provided by class (or variable) MyLookupLib returns one or more rows based on the asset id (a property of the AssetMoveEvent) that is passed to the method:

select * from AssetMoveEvent, method:MyLookupLib.lookupAsset(assetId)

The following statement demonstrates the use of the where clause to join events to the rows returned by a method invocation, which in this example does not take parameters:

select assetId, assetDesc from AssetMoveEvent as asset, 
       method:MyLookupLib.getAssetDescriptions() as desc 
where asset.assetId = desc.assetId

Your method invocation may return zero, one or many rows for each method invocation. If you have caching enabled through configuration, then Esper can avoid the method invocation and instead use cached results. Similar to SQL joins, Esper also indexes cached result rows such that join operations based on the where clause or outer-join on clause can be very efficient, especially if your method invocation returns a large number of rows.

If the time taken by method invocations is critical to your application, you may configure local caches as Section 16.4.7, “Cache Settings for From-Clause Method Invocations” describes.

Esper analyzes the expression in the EPL where clause and outer-join on clause, if present, and builds the appropriate indexes from that information at runtime, to ensure fast matching of event stream events to method invocation results, even if your method invocation returns a large number of rows. Your application must configure a cache for your method invocation using Esper configuration, as such indexes are held with regular data in a cache. If your application does not enable caching of method invocation results, the engine does not build indexes.

Your application can provide a public static method or an instance method of an existing object. The method must accept the same number and type of parameters as listed in the parameter expression list.

The examples herein mostly use public static methods. For a detailed description of instance methods please see Section 5.17.5, “Class and Event-Type Variables” and the example below.

If your method invocation returns either no row or only one row, then the return type of the method can be a Java class, java.util.Map or Object[] (object-array). If your method invocation can return more than one row, then the return type of the method must be an array of Java class, array of Map, Object[][] (object-array 2-dimensional) or Collection or Iterator (or subtypes thereof).

If you are using a Java class, an array of Java class or a Collection<Class> or an Iterator<Class> as the return type, then the class must adhere to JavaBean conventions: it must expose properties through getter methods.

If you are using java.util.Map or an array of Map or a Collection<Map> or an Iterator<Map> as the return type, please note the following:

  • Your application must provide a second method that returns event property metadata, as the next section outlines.
  • Each map instance returned by your method should have String-type keys and object values (Map<String, Object>).

If you are using Object[] (object-array) or Object[][] (object-array 2-dimensional) or Collection<Object[]> or Iterator<Object[]> as the return type, please note the following:

  • Your application must provide a second method that returns event property metadata, as the next section outlines.
  • Each object-array instance returned by your method should have the exact same array position for values as the property metadata indicates and the array length must be the same as the number of properties.

Your application method must return either of the following:

  1. A null value or an empty array to indicate an empty result (no rows).

  2. A Java object or Map or Object[] to indicate a zero (null) or one-row result.

  3. Return multiple result rows by returning either:

    • An array of Java objects.
    • An array of Map instances.
    • An array of Object[] instances.
    • A Collection of Java objects.
    • A Collection of Map instances.
    • A Collection of Object[] instances.
    • An Iterator of Java objects.
    • An Iterator of Map instances.
    • An Iterator of Object[] instances.
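The first two cases of the list above can be sketched with a lookup method that returns a single Map row, or null when nothing is found. Everything in the sketch is hypothetical: the class AssetLocationLookupSketch, the method getAssetLocation and the sample data are made up for illustration. Since a Map return type requires property metadata, a companion metadata method is included; it follows the naming pattern of the getAssetHistoryMetadata example elsewhere in this chapter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup class; names and data are made up for illustration.
class AssetLocationLookupSketch {
    private static final Map<String, String> LOCATIONS = new HashMap<String, String>();
    static {
        LOCATIONS.put("A1", "dock-3");
    }

    // Metadata method: property names and types of each returned row
    public static Map<String, Class> getAssetLocationMetadata() {
        Map<String, Class> metadata = new HashMap<String, Class>();
        metadata.put("assetId", String.class);
        metadata.put("location", String.class);
        return metadata;
    }

    // Returns one row as a Map, or null to indicate an empty result (no rows)
    public static Map<String, Object> getAssetLocation(String assetId) {
        String location = LOCATIONS.get(assetId);
        if (location == null) {
            return null;
        }
        Map<String, Object> row = new HashMap<String, Object>();
        row.put("assetId", assetId);
        row.put("location", location);
        return row;
    }
}
```

A method like this could be referenced in a from clause in the form method:AssetLocationLookupSketch.getAssetLocation(assetId), assuming the class is made public and resolvable by the engine through imports.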

As an example, consider the method 'getAssetDescriptions' provided by class 'MyLookupLib' as discussed earlier:

select assetId, assetDesc from AssetMoveEvent as asset,
       method:com.mypackage.MyLookupLib.getAssetDescriptions() as desc 
  where asset.assetId = desc.assetId

The 'getAssetDescriptions' method may return multiple rows and is therefore declared to return an array of the class 'AssetDesc'. The class AssetDesc is a POJO class (not shown here):

public class MyLookupLib {
  ...
  // May return multiple rows, hence the array return type
  public static AssetDesc[] getAssetDescriptions() {
    ...
    return new AssetDesc[] {...};
  }
}

The example above specifies the fully-qualified Java class name of the class 'MyLookupLib' in the EPL statement. The package name does not need to be part of the EPL if your application imports the package using the auto-import configuration through the API or XML, as outlined in Section 16.4.6, “Class and package imports”.

Alternatively, the method in the example above could return a Collection, declared as public static Collection<AssetDesc> getAssetDescriptions() {...}, or an Iterator, declared as public static Iterator<AssetDesc> getAssetDescriptions() {...}.
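As a minimal sketch of the Collection variant, the code below fills in a possible AssetDesc class and a lookup method. The property names assetId and assetDesc are taken from the EPL statement above; the field layout, constructor and sample rows are assumptions for illustration. The classes are nested inside a wrapper class only to keep the sketch in one file; in an application they would typically be public top-level classes.

```java
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical sketch; the wrapper class exists only to keep the example in one file.
class PojoLookupSketch {

    // JavaBean-style row class: each getter exposes one event property
    public static class AssetDesc {
        private final String assetId;
        private final String assetDesc;

        public AssetDesc(String assetId, String assetDesc) {
            this.assetId = assetId;
            this.assetDesc = assetDesc;
        }

        public String getAssetId() {
            return assetId;
        }

        public String getAssetDesc() {
            return assetDesc;
        }
    }

    public static class MyLookupLib {
        // Collection-returning variant of the array-returning method
        public static Collection<AssetDesc> getAssetDescriptions() {
            Collection<AssetDesc> rows = new ArrayList<AssetDesc>();
            rows.add(new AssetDesc("A1", "forklift"));
            rows.add(new AssetDesc("A2", "pallet jack"));
            return rows;
        }
    }
}
```

Because AssetDesc exposes its values through getter methods, no separate metadata method is needed for this return type.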

If your application has an existing object instance such as a service or a dependency-injected bean, then it must make the instance available as a variable. Please see Section 5.17.5, “Class and Event-Type Variables” for more information.

For example, assuming you provided a stateChecker variable that points to an object instance that provides a public getMatchingAssets instance method and that returns property assetIds, you may use the state checker service in the from-clause as follows:

select assetIds from AssetMoveEvent, method:stateChecker.getMatchingAssets(assetDesc)

Your application may return java.util.Map or an array of Map from method invocations. If doing so, your application must provide metadata about each row: it must declare the property name and property type of each Map entry of a row. This information allows the engine to perform type checking of expressions used within the statement.

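You declare the property names and types of each row by providing a method that returns property metadata. As the example below shows, the metadata method carries the name of the row-returning method with the suffix Metadata appended, takes no parameters, and returns a Map of String property names to java.lang.Class property types (Map<String, Class>).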

In the following example, a class 'MyLookupLib' provides a method to return historical data based on asset id and asset code:

select assetId, location, x_coord, y_coord from AssetMoveEvent as asset,
       method:com.mypackage.MyLookupLib.getAssetHistory(assetId, assetCode) as history

A sample implementation of the class 'MyLookupLib' is shown below.

import java.util.HashMap;
import java.util.Map;

public class MyLookupLib {
  ...
  // For each column in a row, provide the property name and type
  //
  public static Map<String, Class> getAssetHistoryMetadata() {
    Map<String, Class> propertyNames = new HashMap<String, Class>();
    propertyNames.put("location", String.class);
    propertyNames.put("x_coord", Integer.class);
    propertyNames.put("y_coord", Integer.class);
    return propertyNames;
  }
... 
  // Lookup rows based on assetId and assetCode
  // 
  @SuppressWarnings("unchecked")
  public static Map<String, Object>[] getAssetHistory(String assetId, String assetCode) {
    Map<String, Object>[] rows = new Map[2];	// this sample returns 2 rows
    for (int i = 0; i < 2; i++) {
      rows[i] = new HashMap<String, Object>();
      rows[i].put("location", "somevalue");
      rows[i].put("x_coord", 100);
      // ... set more values for each row
    }
    return rows;
  }
}

In the example above, the 'getAssetHistoryMetadata' method provides the property metadata: the names and types of properties in each row. The engine calls this method once per statement to determine event typing information.

The 'getAssetHistory' method returns an array of Map objects that represents two rows. The implementation shown above is a simple example. The parameters to the method are the assetId and assetCode properties of the AssetMoveEvent joined to the method. The engine calls this method for each insert and remove stream event in AssetMoveEvent.

To indicate that no rows are found in a join, your application method may return either a null value or an array of size zero.

Alternatively, the method in the example above could return a Collection, declared as public static Collection<Map> getAssetHistory() {...}, or an Iterator, declared as public static Iterator<Map> getAssetHistory() {...}.

Your application may return Object[] (object-array) or an array of Object[] (object-array 2-dimensional) from method invocations. If doing so, your application must provide metadata about each row: it must declare the property name and property type of each array entry of a row in the exact same order as provided by value rows. This information allows the engine to perform type checking of expressions used within the statement.

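You declare the property names and types of each row by providing a method that returns property metadata. As the example below shows, the metadata method carries the name of the row-returning method with the suffix Metadata appended, takes no parameters, and returns an ordered map of String property names to java.lang.Class property types (LinkedHashMap<String, Class>), so that the order of properties matches the order of values in each row.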

In the following example, a class 'MyLookupLib' provides a method to return historical data based on asset id and asset code:

select assetId, location, x_coord, y_coord from AssetMoveEvent as asset,
       method:com.mypackage.MyLookupLib.getAssetHistory(assetId, assetCode) as history

A sample implementation of the class 'MyLookupLib' is shown below.

import java.util.LinkedHashMap;

public class MyLookupLib {
  ...
  // For each column in a row, provide the property name and type,
  // in the same order as the values in each row
  public static LinkedHashMap<String, Class> getAssetHistoryMetadata() {
    LinkedHashMap<String, Class> propertyNames = new LinkedHashMap<String, Class>();
    propertyNames.put("location", String.class);
    propertyNames.put("x_coord", Integer.class);
    propertyNames.put("y_coord", Integer.class);
    return propertyNames;
  }
... 
  // Lookup rows based on assetId and assetCode
  // 
  public static Object[][] getAssetHistory(String assetId, String assetCode) {
    Object[][] rows = new Object[5][];	// this sample returns 5 rows
    for (int i = 0; i < 5; i++) {
      rows[i] = new Object[3]; // one value per declared property
      rows[i][0] = "somevalue"; // location
      rows[i][1] = 100;         // x_coord
      rows[i][2] = 200;         // y_coord
    }
    return rows;
  }
}

In the example above, the 'getAssetHistoryMetadata' method provides the property metadata: the names and types of properties in each row. The engine calls this method once per statement to determine event typing information.

The 'getAssetHistory' method returns an Object[][] that represents five rows. The implementation shown above is a simple example. The parameters to the method are the assetId and assetCode properties of the AssetMoveEvent joined to the method. The engine calls this method for each insert and remove stream event in AssetMoveEvent.

To indicate that no rows are found in a join, your application method may return either a null value or an array of size zero.

Alternatively, the method in the example above could return a Collection, declared as public static Collection<Object[]> getAssetHistory() {...}, or an Iterator, declared as public static Iterator<Object[]> getAssetHistory() {...}.

EPL allows declaring an event type via the create schema clause and also by means of the static or runtime configuration API addEventType functions. The terms schema and event type have the same meaning in EPL.

Your application can declare an event type by providing the property names and types or by providing a class name. Your application may also declare a variant stream schema.

When using the create schema syntax to declare an event type, the engine automatically removes the event type when there are no started statements referencing the event type, including the statement that declared the event type. When using the configuration API, the event type stays cached even if there are no statements that refer to the event type and until explicitly removed via the runtime configuration API.

The synopsis of the create schema syntax providing property names and types is:

create [map | objectarray] schema schema_name [as] 
    (property_name property_type [,property_name property_type] [,...])
  [inherits inherited_event_type[, inherited_event_type] [,...]]
  [starttimestamp timestamp_property_name]
  [endtimestamp timestamp_property_name]
  [copyfrom copy_type_name [, copy_type_name] [,...]]

The create keyword can be followed by map to instruct the engine to represent events of that type by the Map event representation, or objectarray to denote an Object-array event type. If neither the map nor the objectarray keyword is provided, the engine-wide default event representation applies.

After create schema follows a schema_name. The schema name is the event type name.

The property_name is an identifier providing the event property name. The property_type is also required for each property. Valid property types are listed in Section 5.17.1, “Creating Variables: the Create Variable clause” and in addition include:

  1. Any Java class name, fully-qualified or the simple class name if imports are configured.

  2. Add left and right square brackets [] to any type to denote an array-type event property.

  3. Use an event type name as a property type.

  4. The null keyword for a null-typed property.

The optional inherits keyword is followed by a comma-separated list of event type names that are the supertypes to the declared type.

The optional starttimestamp keyword is followed by a property name. Use this to tell the engine that your event has a timestamp. The engine checks that the property name exists on the declared type and returns a date-time value. Declare a timestamp property if you want your events to implicitly carry a timestamp value for convenient use with interval algebra methods as a start timestamp.

The optional endtimestamp keyword is followed by a property name. Use this together with starttimestamp to tell the engine that your event has a duration. The engine checks that the property name exists on the declared type and returns a date-time value. Declare an endtimestamp property if you want your events to implicitly carry a duration value for convenient use with interval algebra methods.

The optional copyfrom keyword is followed by a comma-separated list of event type names. For each event type listed, the engine looks up that type and adds all event property definitions to the newly-defined type, in addition to those listed explicitly (if any).

A few example event type declarations follow:

// Declare type SecurityEvent
create schema SecurityEvent as (ipAddress string, userId String, numAttempts int)
			
// Declare type AuthorizationEvent with the roles property being an array of String 
// and the hostinfo property being a POJO object
create schema AuthorizationEvent(group String, roles String[], hostinfo com.mycompany.HostNameInfo)

// Declare type CompositeEvent in which the innerEvents property is an array of SecurityEvent
create schema CompositeEvent(group String, innerEvents SecurityEvent[])

// Declare type WebPageVisitEvent that inherits all properties from PageHitEvent
create schema WebPageVisitEvent(userId String) inherits PageHitEvent

// Declare a type with start and end timestamp (i.e. event with duration).
create schema RoboticArmMovement (robotId string, startts long, endts long) 
  starttimestamp startts endtimestamp endts
  
// Create a type that has all properties of SecurityEvent plus a userName property
create schema ExtendedSecurityEvent (userName string) copyfrom SecurityEvent

// Create a type that has all properties of SecurityEvent 
create schema SimilarSecurityEvent () copyfrom SecurityEvent

// Create a type that has all properties of SecurityEvent and WebPageVisitEvent plus a userName property
create schema WebSecurityEvent (userName string) copyfrom SecurityEvent, WebPageVisitEvent

To elaborate on the inherits keyword, consider the following two schema definitions:

create schema Foo as (prop1 string)
create schema Bar() inherits Foo

Given the above schemas, Foo is a supertype of Bar and therefore any Bar event also fulfills Foo and matches where Foo matches. An EPL statement such as select * from Foo returns any Foo event as well as any event that is a subtype of Foo, such as all Bar events. When your EPL queries don't use any Foo events there is no cost, thus inherits is generally an effective way to share properties between types. The start and end timestamp are also inherited from any supertype that has the timestamp property names defined.

The optional copyfrom keyword is for defining a schema based on another schema. This keyword causes the engine to copy property definitions only: there is no inherits, extends, supertype or subtype relationship between the new type and the types listed.

To define an event type Bar that has the same properties as Foo:

create schema Foo as (prop1 string)
create schema Bar() copyfrom Foo

To define an event type Bar that has the same properties as Foo and that adds its own property prop2:

create schema Foo as (prop1 string)
create schema Bar(prop2 string) copyfrom Foo
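The difference between inherits and copyfrom can be sketched in plain Java. The block below is only an illustrative model and does not use the Esper API: the SchemaModel class and its methods are hypothetical names introduced for this sketch. It assumes, as described above, that inherits shares properties and records a supertype relationship, while copyfrom copies property definitions without any type relationship.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of schema declarations; not an Esper class.
public class SchemaModel {
    public final String name;
    // Property name mapped to property type name.
    public final Map<String, String> properties = new LinkedHashMap<>();
    private final Set<SchemaModel> supertypes = new HashSet<>();

    public SchemaModel(String name) {
        this.name = name;
    }

    public SchemaModel property(String propName, String typeName) {
        properties.put(propName, typeName);
        return this;
    }

    // Models "inherits": properties are shared and a subtype relationship is recorded.
    public SchemaModel inherits(SchemaModel parent) {
        properties.putAll(parent.properties);
        supertypes.add(parent);
        return this;
    }

    // Models "copyfrom": property definitions are copied, no type relationship is kept.
    public SchemaModel copyFrom(SchemaModel source) {
        properties.putAll(source.properties);
        return this;
    }

    // True if an event of this type would match a statement selecting from the given type.
    public boolean matches(SchemaModel selected) {
        if (this == selected) {
            return true;
        }
        for (SchemaModel supertype : supertypes) {
            if (supertype.matches(selected)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SchemaModel foo = new SchemaModel("Foo").property("prop1", "string");
        SchemaModel inherited = new SchemaModel("Bar").inherits(foo);
        SchemaModel copied = new SchemaModel("Bar").property("prop2", "string").copyFrom(foo);
        System.out.println(inherited.matches(foo)); // true: Bar events match select * from Foo
        System.out.println(copied.matches(foo));    // false: same properties, unrelated type
    }
}
```

In this model both variants of Bar end up with prop1, but only the inheriting variant is delivered to a statement that selects from Foo.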

If neither the map nor the objectarray keyword is provided and the create-schema statement carries the @EventRepresentation(array=true) annotation, the engine expects object-array events. If the statement carries the @EventRepresentation(array=false) annotation, the engine expects Map objects as events. If neither annotation is provided, the engine uses the configured default event representation as discussed in Section 16.4.11.1, “Default Event Representation”.

The following two EPL statements both instruct the engine to represent Foo events as object arrays. When sending Foo events into the engine use the sendEvent(Object[] data, String typeName) footprint.

create objectarray schema Foo as (prop1 string)
@EventRepresentation(array=true) create schema Foo as (prop1 string)

The next two EPL statements both instruct the engine to represent Foo events as Maps. When sending Foo events into the engine use the sendEvent(Map data, String typeName) footprint.

create map schema Foo as (prop1 string)
@EventRepresentation(array=false) create schema Foo as (prop1 string)

A variant stream is a predefined stream into which events of multiple disparate event types can be inserted. Please see Section 5.10.3, “Merging Disparate Types of Events: Variant Streams” for rules regarding property visibility and additional information.

The synopsis is:

create variant schema schema_name [as] eventtype_name|* [, eventtype_name|*] [,...]

Provide the variant keyword to declare a variant stream.

The '*' wildcard character declares a variant stream that accepts any type of event inserted into the variant stream.

Provide eventtype_name if the variant stream should hold events of the given type only. When using insert into to insert into the variant stream the engine checks to ensure the inserted event type or its supertypes match the required event type.

A few examples are shown below:

// Create a variant stream that accepts only LoginEvent and LogoutEvent event types
create variant schema SecurityVariant as LoginEvent, LogoutEvent

// Create a variant stream that accepts any event type
create variant schema AnyEvent as *

EPL offers a convenient syntax for splitting, routing or duplicating events into multiple streams, and for receiving events that match none of a set of filter criteria.

For splitting a single event that acts as a container and exposes child events as a property of itself, consider the contained-event syntax as described in Section 5.19, “Contained-Event Selection”.

You may define a triggering event or pattern in the on-part of the statement followed by multiple insert into, select and where clauses.

The synopsis is:

[context context_name]
on event_type[(filter_criteria)] [as stream_name]
insert into insert_into_def select select_list [where condition]
[insert into insert_into_def select select_list [where condition]]
[insert into...]
[output first | all]

The event_type is the name of the type of events that trigger the split stream. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as keyword can be used to assign a stream name. Patterns and named windows can also be specified in the on clause.

Following the on-clause is one or more insert into clauses as described in Section 5.10, “Merging Streams and Continuous Insertion: the Insert Into Clause” and select clauses as described in Section 5.3, “Choosing Event Properties And Events: the Select Clause”.

Each select clause may be followed by a where clause containing a condition. If the condition is true for the event, the engine transforms the event according to the select clause and inserts it into the corresponding stream.

At the end of the statement can be an optional output clause. By default the engine inserts into the first stream for which the where clause condition matches if one was specified, starting from the top. If you specify the output all keywords, then the engine inserts into each stream (not only the first stream) for which the where clause condition matches or that do not have a where clause.

If, for a given event, none of the where clause conditions match, the statement listener receives the unmatched event. The statement listener only receives unmatched events and does not receive any transformed or inserted events. The iterator method of the statement returns no events.

You may specify an optional context name to the effect that the split-stream operates according to the context dimensional information as declared for the context. See Chapter 4, Context and Context Partitions for more information.

In the below sample statement, the engine inserts each OrderEvent into the LargeOrders stream if the order quantity is 100 or larger, or into the SmallOrders stream if the order quantity is smaller than 100:

on OrderEvent 
  insert into LargeOrders select * where orderQty >= 100
  insert into SmallOrders select *

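The next example statement adds a new stream for medium-sized orders. The new stream receives orders that have an order quantity between 20 and 100. Because the engine by default inserts into the first matching stream only, an order with a quantity of exactly 100 goes to LargeOrders and not to MediumOrders: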

on OrderEvent 
  insert into LargeOrders select orderId, customer where orderQty >= 100
  insert into MediumOrders select orderId, customer where orderQty between 20 and 100
  insert into SmallOrders select orderId, customer where orderQty > 0

As you may have noticed in the above statement, orders that have an order quantity of zero don't match any of the conditions. The engine does not insert such order events into any stream and the listener to the statement receives these unmatched events.

By default the engine inserts into the first insert into stream without a where clause or for which the where clause condition matches. To change the default behavior and insert into all matching streams instead (including those without a where clause), the output all keywords may be added to the statement.

The sample statement below shows the use of the output all keywords. The statement populates both the LargeOrders stream with large orders as well as the VIPCustomerOrders stream with orders for certain customers based on customer id:

on OrderEvent 
  insert into LargeOrders select * where orderQty >= 100
  insert into VIPCustomerOrders select * where customerId in (1001, 1002)
  output all

Since the output all keywords are present, the above statement inserts each order event into either both streams or only one stream or none of the streams, depending on order quantity and customer id of the order event. The statement delivers order events not inserted into any of the streams to the listeners and/or subscriber to the statement.
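The routing rules for the default behavior and for output all can be sketched in plain Java. This is an illustrative model only and not the Esper API: SplitStreamModel, Order, route and sampleClauses are hypothetical names, and the two predicates mirror the where clauses of the sample statement above. An empty result stands for an unmatched event, which the engine delivers to the statement listener.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of split-stream routing; not an Esper class.
public class SplitStreamModel {

    // Minimal stand-in for OrderEvent.
    public static final class Order {
        final int orderQty;
        final int customerId;

        public Order(int orderQty, int customerId) {
            this.orderQty = orderQty;
            this.customerId = customerId;
        }
    }

    // Returns the stream names the order is inserted into, in statement order.
    // An empty list means the order is unmatched.
    public static List<String> route(Map<String, Predicate<Order>> clauses,
                                     Order order, boolean outputAll) {
        List<String> targets = new ArrayList<>();
        for (Map.Entry<String, Predicate<Order>> clause : clauses.entrySet()) {
            if (clause.getValue().test(order)) {
                targets.add(clause.getKey());
                if (!outputAll) {
                    break; // default behavior: stop at the first matching clause
                }
            }
        }
        return targets;
    }

    // The two insert-into clauses of the sample statement, top to bottom.
    public static Map<String, Predicate<Order>> sampleClauses() {
        Map<String, Predicate<Order>> clauses = new LinkedHashMap<>();
        clauses.put("LargeOrders", o -> o.orderQty >= 100);
        clauses.put("VIPCustomerOrders", o -> o.customerId == 1001 || o.customerId == 1002);
        return clauses;
    }

    public static void main(String[] args) {
        Map<String, Predicate<Order>> clauses = sampleClauses();
        System.out.println(route(clauses, new Order(150, 1001), true));  // [LargeOrders, VIPCustomerOrders]
        System.out.println(route(clauses, new Order(150, 1001), false)); // [LargeOrders]
        System.out.println(route(clauses, new Order(5, 7), true));       // []
    }
}
```

A LinkedHashMap keeps the clauses in declaration order, which matters for the default first-match behavior.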

The following limitations apply to split-stream statements:

  1. Aggregation functions and the prev and prior operators are not available in conditions and the select-clause.

A variable is a scalar, object, event or set of aggregation values that is available for use in all statements including patterns. Variables can be used in an expression anywhere in a statement as well as in the output clause for output rate limiting.

Variables must first be declared or configured before use, by defining each variable's type and name. Variables can be created via the create variable syntax or declared by runtime or static configuration. Variables can be assigned new values by using the on set syntax or via the setVariableValue methods on EPRuntime. The EPRuntime also provides methods to read variable values.

A variable can be declared constant. A constant variable always has the initial value and cannot be assigned a new value. A constant variable can be used like any other variable and can be used wherever a constant is required. By declaring a variable constant you enable the Esper engine to optimize and perform query planning knowing that the variable value cannot change.

When declaring a class-typed, event-typed or aggregation-typed variable you may read or set individual properties within the same variable.

The engine guarantees consistency and atomicity of variable reads and writes on the level of context partition (this is a soft guarantee, see below). Variables are optimized for fast read access and are also multithread-safe.

When you associate a context to the variable then each context partition maintains its own variable value. See Section 4.8, “Context and Variables” for more information.

Variables can also be removed, at runtime, by destroying all referencing statements including the statement that created the variable, or by means of the runtime configuration API.

The create variable syntax creates a new variable by defining the variable type and name. As an alternative to this syntax, variables can also be declared in the runtime and engine configuration options.

The synopsis for creating a variable is as follows:

create [constant] variable variable_type [[]] variable_name [ = assignment_expression ]

Specify the optional constant keyword when the variable is a constant whose associated value cannot be altered. Your EPL design should prefer constant variables over non-constant variables.

The variable_type can be any of the following:

variable_type
	:  string
	|  char 
	|  character
	|  bool 
	|  boolean
	|  byte
	|  short 
	|  int 
	|  integer 
	|  long 
	|  double
	|  float
	|  object
	|  enum_class
	|  class_name
	|  event_type_name

Variable types can accept null values. The object type is for an untyped variable that can be assigned any value. You can provide a class name (use imports) or a fully-qualified class name to declare a variable of that Java class type including an enumeration class. You can also supply the name of an event type to declare a variable that holds an event of that type.

Append [] to the variable type to declare an array variable. A limitation is that if your variable type is an event type then array is not allowed (applies to variables only and not to named windows or tables). For arrays of primitives, specify [primitive], for example int[primitive].

The variable_name is an identifier that names the variable. The variable name should not already be in use by another variable.

The assignment_expression is optional. Without an assignment expression the initial value for the variable is null. If present, it supplies the initial value for the variable.

The EPStatement object of the create variable statement provides access to variable values. The pull API methods iterator and safeIterator return the current variable value. Listeners to the create variable statement subscribe to changes in variable value: the engine posts new and old value of the variable to all listeners when the variable value is updated by an on set statement.

The example below creates a variable that provides a threshold value. The name of the variable is var_threshold and its type is long. The variable's initial value is null as no other value has been assigned:

create variable long var_threshold

This statement creates an integer-type variable named var_output_rate and initializes it to the value ten (10):

create variable integer var_output_rate = 10

The next statement declares a constant string-type variable:

create constant variable string const_filter_symbol = 'GE'

In addition to creating a variable via the create variable syntax, the runtime and engine configuration API also allows adding variables. The next code snippet illustrates the use of the runtime configuration API to create a string-typed variable:

epService.getEPAdministrator().getConfiguration()
  .addVariable("myVar", String.class, "init value");

The following example declares a constant that is an array of string:

create constant variable string[] const_filters = {'GE', 'MSFT'}

The next example declares a constant that is an array of enumeration values. It assumes the Color enumeration class was imported:

create constant variable Color[] const_colors = {Color.RED, Color.BLUE}

For an array of primitive-type bytes, specify the primitive keyword in square brackets, as the next example shows:

create variable byte[primitive] mybytes = SomeClass.getBytes()

Use the new keyword to initialize object instances (the example assumes the package or class was imported):

create constant variable AtomicInteger cnt = new AtomicInteger(1)

The engine removes the variable if the statement that created the variable is destroyed and all statements that reference the variable are also destroyed. The getVariableNameUsedBy and the removeVariable methods, both part of the runtime ConfigurationOperations API, provide usage information and can remove a variable. If the variable was added via configuration, it can only be removed via the configuration API.

The on set statement assigns a new value to one or more variables when a triggering event arrives or a triggering pattern occurs. Use the setVariableValue methods on EPRuntime to assign variable values programmatically.

The synopsis for setting variable values is:

on event_type[(filter_criteria)] [as stream_name]
  set variable_name = expression [, variable_name = expression [,...]]

The event_type is the name of the type of events that trigger the variable assignments. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as keyword can be used to assign a stream name. Patterns and named windows can also be specified in the on clause.

The comma-separated list of variable names and expressions sets the value of one or more variables. Subqueries may be part of expressions; however, aggregation functions and the prev or prior functions may not be used in expressions.

All new variable values are applied atomically: the changes to variable values by the on set statement become visible to other statements all at the same time. No changes are visible to other processing threads until the on set statement completed processing, and at that time all changes become visible at once.

The EPStatement object provides access to variable values. The pull API methods iterator and safeIterator return the current variable values for each of the variables set by the statement. Listeners to the statement subscribe to changes in variable values: the engine posts new variable values of all variables to any listeners.

In the following example, a variable by name var_output_rate has been declared previously. When a NewOutputRateEvent event arrives, the variable is updated to a new value supplied by the event property 'rate':

on NewOutputRateEvent set var_output_rate = rate

The next example shows two variables that are updated when a ThresholdUpdateEvent arrives:

on ThresholdUpdateEvent as t 
  set var_threshold_lower = t.lower,
      var_threshold_higher = t.higher

The sample statement shown next counts the number of pattern matches using a variable. The pattern looks for OrderEvent events that are followed by CancelEvent events for the same order id within 10 seconds of the OrderEvent:

on pattern[every a=OrderEvent -> (CancelEvent(orderId=a.orderId) where timer:within(10 sec))]
  set var_counter = var_counter + 1

A variable name can be used in any expression and can also occur in an output rate limiting clause. This section presents examples and discusses performance, consistency and atomicity attributes of variables.

The next statement assumes that a variable named 'var_threshold' was created to hold a total price threshold value. The statement outputs an event when the total price for a symbol is greater than the current threshold value:

select symbol, sum(price) from TickEvent 
group by symbol 
having sum(price) > var_threshold

In this example we use a variable to dynamically change the output rate on-the-fly. The variable 'var_output_rate' holds the current rate at which the statement posts a current count to listeners:

select count(*) from TickEvent output every var_output_rate seconds

Variables are optimized towards high read frequency and lower write frequency. Variable reads do not incur locking overhead (99% of the time) while variable writes do incur locking overhead.

The engine softly guarantees consistency and atomicity of variables when your statement executes in response to an event or timer invocation. Variables acquire a stable value (implemented by versioning) when your statement starts executing in response to an event or timer invocation, and variables do not change value during execution. When one or more variable values are updated via on set statements, the changes to all updated variables become visible to statements as one unit and only when the on set statement completes successfully.
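The idea of a stable, versioned view can be illustrated with a small plain-Java sketch. This is not how the Esper engine is implemented and it does not use the Esper API: VariableSnapshotModel and its methods are hypothetical names, and the sketch assumes a simple copy-on-write scheme in which each write publishes a new immutable version. It shows only the two properties described above: a reader keeps the values it started with, and several assignments become visible as one unit.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical copy-on-write model of versioned variables; not an Esper class.
public class VariableSnapshotModel {
    // The current version: an immutable map of variable name to value.
    private final AtomicReference<Map<String, Object>> current =
            new AtomicReference<>(Collections.<String, Object>emptyMap());

    // A reader captures one version and uses it for the whole evaluation.
    public Map<String, Object> snapshot() {
        return current.get();
    }

    // Applies all assignments as one unit, the way an on-set statement would.
    public void setAll(Map<String, Object> assignments) {
        Map<String, Object> previous;
        Map<String, Object> next;
        do {
            previous = current.get();
            Map<String, Object> copy = new HashMap<>(previous);
            copy.putAll(assignments);
            next = Collections.unmodifiableMap(copy);
            // Retry if another writer published a version in the meantime.
        } while (!current.compareAndSet(previous, next));
    }

    public static void main(String[] args) {
        VariableSnapshotModel variables = new VariableSnapshotModel();

        Map<String, Object> first = new HashMap<>();
        first.put("var_threshold_lower", 1);
        first.put("var_threshold_higher", 10);
        variables.setAll(first);

        // A statement starts executing and captures its view of the variables.
        Map<String, Object> view = variables.snapshot();

        Map<String, Object> second = new HashMap<>();
        second.put("var_threshold_lower", 2);
        second.put("var_threshold_higher", 20);
        variables.setAll(second);

        System.out.println(view.get("var_threshold_lower"));                 // 1
        System.out.println(variables.snapshot().get("var_threshold_lower")); // 2
    }
}
```

The sketch does not model the time limit on how long values are held stable.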

The atomicity and consistency guarantee is a soft guarantee. If any of your application statements, in response to an event or timer invocation, execute for a time interval longer than 15 seconds (the default interval length), then the engine may use current variable values after the 15 seconds have passed, rather than the variable values that were current at the time the statement started executing in response to the event or timer invocation.

The length of the time interval that variable values are held stable for the duration of execution of a given statement is by default 15 seconds, but can be configured via engine default settings.

The create variable syntax and the API accept a fully-qualified class name or alternatively the name of an event type. This is useful when you want a single variable to have multiple property values to read or set.

The next statement assumes that the event type PageHitEvent is declared:

create variable PageHitEvent varPageHitZero

These example statements show two ways of assigning to the variable:

// You may assign the complete event
on PageHitEvent(ip='0.0.0.0') pagehit set varPageHitZero = pagehit
// Or assign individual properties of the event
on PageHitEvent(ip='0.0.0.0') pagehit set varPageHitZero.userId = pagehit.userId

Similarly statements may use properties of class or event-type variables as this example shows:

select * from FirewallEvent(userId=varPageHitZero.userId)

Instance methods can also be invoked:

create variable com.example.StateCheckerService stateChecker
select * from TestEvent as e where stateChecker.checkState(e)

A variable that represents a service for calling instance methods could be initialized by calling a factory method. This example assumes the classes were added to imports:

create constant variable StateCheckerService stateChecker = StateCheckerServiceFactory.makeService()

Or the variable can be added via the config API; an example code snippet is next:

admin.getConfiguration().addVariable("stateChecker", StateCheckerService.class, StateCheckerServiceFactory.makeService(), true);

Your application can declare an expression or script using the create expression clause. Such expressions or scripts become available globally to any EPL statement.

The synopsis of the create expression syntax is:

create expression expression_or_script

Use the create expression keywords and append the expression or scripts.

At the time your application creates the create expression statement the expression or script becomes globally visible.

At the time your application destroys the create expression statement the expression or script is no longer visible. Existing statements that use the global expression or script are unaffected.

Expression aliases are the simplest means of sharing expressions and do not accept parameters. Expression declarations limit the expression scope to the parameters that are passed.

The syntax and additional examples for declaring an expression alias are outlined in Section 5.2.8, “Expression Alias”, which discusses expression aliases that are visible within the same EPL statement, i.e. visible locally only.

When using the create expression syntax to declare an expression the engine remembers the expression alias and expression and allows the alias to be referenced in all other EPL statements.

The below EPL declares a globally visible expression alias for an expression that computes the total of the mid-price, where the mid-price is the sum of the buy and sell price divided by two:

create expression totalMidPrice alias for { sum((buy + sell) / 2) }

The next EPL returns the total mid-price per symbol for those symbols whose total mid-price stays below 10:

select symbol, totalMidPrice from MarketDataEvent group by symbol having totalMidPrice < 10

The expression name must be unique among all other expression aliases and expression declarations.

Your application can provide an expression alias of the same name local to a given EPL statement as well as globally using create expression. The locally-provided expression alias overrides the global expression alias.

The engine validates global expression aliases at the time your application creates a statement that references the alias. When a statement references a global alias, the engine uses that statement's local expression scope to validate the expression. Expression aliases can therefore be dynamically typed and type information does not need to be the same for all statements that reference the expression alias.

The syntax and additional examples for declaring an expression are outlined in Section 5.2.9, “Expression Declaration”, which discusses declaring expressions that are visible within the same EPL statement, i.e. visible locally only.

When using the create expression syntax to declare an expression the engine remembers the expression and allows the expression to be referenced in all other EPL statements.

The below EPL declares a globally visible expression that computes a mid-price and that requires a single parameter:

create expression midPrice { in => (buy + sell) / 2 }

The next EPL returns mid-price for each event:

select midPrice(md) from MarketDataEvent as md

The expression name must be unique for global expressions. It is not possible to declare the same global expression twice with the same name.

Your application can declare an expression of the same name local to a given EPL statement as well as globally using create expression. The locally-declared expression overrides the globally declared expression.

The engine validates globally declared expressions at the time your application creates a statement that references the global expression. When a statement references a global expression, the engine uses that statement's type information to validate the global expressions. Global expressions can therefore be dynamically typed and type information does not need to be the same for all statements that reference the global expression.

This example shows a sequence of EPL statements that can be created in the order shown and that demonstrates expression validation at the time of referral:

create expression minPrice {(select min(price) from OrderWindow)}
create window OrderWindow.win:time(30) as OrderEvent
insert into OrderWindow select * from OrderEvent
// Validates and incorporates the declared global expression
select minPrice() as minprice from MarketData

The syntax and additional examples for declaring scripts are outlined in Chapter 19, Script Support, which discusses declaring scripts that are visible within the same EPL statement, i.e. visible locally only.

When using the create expression syntax to declare a script the engine remembers the script and allows the script to be referenced in all other EPL statements.

The below EPL declares a globally visible script in the JavaScript dialect that computes a mid-price:

create expression midPrice(buy, sell) [ (buy + sell) / 2 ]

The next EPL returns mid-price for each event:

select midPrice(buy, sell) from MarketDataEvent

The engine validates globally declared scripts at the time your application creates a statement that references the global script. When a statement references a global script, the engine uses that statement's type information to determine parameter types. Global scripts can therefore be dynamically typed and type information does not need to be the same for all statements that reference the global script.

The script name in combination with the number of parameters must be unique for global scripts. It is not possible to declare the same global script twice with the same name and number of parameters.

Your application can declare a script of the same name and number of parameters that is local to a given EPL statement as well as globally using create expression. The locally-declared script overrides the globally declared script.

Contained-event selection is for use when an event contains properties that are themselves events, or more generally when your application needs to split an event into multiple events. One example is when application events are coarse-grained structures and you need to perform bulk operations on the rows of the property graph in an event.

Use the contained-event selection syntax in a filter expression such as in a pattern, from clause, subselect, on-select and on-delete. This section provides the synopsis and examples.

To review, in the from clause a contained_selection may appear after the event stream name and filter criteria, and before any view specifications.

The synopsis for contained_selection is as follows:

[select select_expressions from] 
  contained_expression [@type(eventtype_name)] [as alias_name]
  [where filter_expression]

The select clause and select_expressions are optional and may be used to select specific properties of contained events.

The contained_expression is required and returns individual events. The expression can, for example, be an event property name that returns an event fragment, i.e. a property that can itself be represented as an event by the underlying event representation. The expression can also be any other expression such as a single-row function or a script that returns either an array or a java.util.Collection of events. Simple values such as integer or string are not fragments but can be used as well as described in Section 5.19.6, “Arrays returned by a Contained Expression”.

Provide the @type(name) annotation after the contained expression to name the event type of events returned by the expression. The annotation is optional and not needed when the contained-expression is an event property that returns a class or other event fragment.

The alias_name can be provided to assign a name to the expression result value rows.

The where clause and filter_expression are optional and may be used to filter the contained events.

As an example event, consider a media order. A media order consists of order items as well as product descriptions. A media order event can be represented as an object graph (POJO event representation), a structure of nested Maps (Map event representation), an XML document (XML DOM or Axiom event representation) or another custom plug-in event representation.

To illustrate, a sample media order event in XML event representation is shown below. An XML event type can optionally be strongly-typed with an explicit XML XSD schema that we don't show here. Note that Map and POJO representation can be considered equivalent for the purpose of this example.

Let us now assume that we have declared the event type MediaOrder as being represented by the root node <mediaorder> of the following XML snippet:

<mediaorder>
  <orderId>PO200901</orderId>
  <items>
    <item>
      <itemId>100001</itemId>
      <productId>B001</productId>
      <amount>10</amount>
      <price>11.95</price>
    </item>
  </items>
  <books>
    <book>
      <bookId>B001</bookId>
      <author>Heinlein</author>
      <review>
        <reviewId>1</reviewId>
        <comment>best book ever</comment>
      </review>
    </book>
    <book>
      <bookId>B002</bookId>
      <author>Isaac Asimov</author>
    </book>
  </books>
</mediaorder>

The next query utilizes the contained-event selection syntax to return each book:

select * from MediaOrder[books.book]

The result of the above query is one event per book. Output events contain only the book properties and not any of the mediaorder-level properties.

Note that, when using listeners, the engine delivers multiple results in one invocation of each listener. Therefore listeners to the above statement can expect a single invocation passing all book events within one media order event as an array.

To better illustrate the position of the contained-event selection syntax in a statement, consider the next two queries:

select * from MediaOrder(orderId='PO200901')[books.book]

The above query returns each book only for media orders with a given order id. The next query illustrates a contained-event selection and a view:

select count(*) from MediaOrder[books.book].std:unique(bookId)

The sample above counts each book unique by book id.

Contained-event selection can be staggered. When staggering multiple contained-event selections the staggered contained-event selection is relative to its parent.

This example demonstrates staggering contained-event selections by selecting each review of each book:

select * from MediaOrder[books.book][review]

Listeners to the query above receive a row for each review of each book. Output events contain only the review properties and not the book or media order properties.
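The row counts produced by simple and staggered contained-event selection can be sketched in plain Java against the sample media order. The block is an illustrative model only and does not use the Esper API: ContainedEventModel, Book, Review and the select methods are hypothetical names, and the sample data mirrors the XML shown earlier (one book with a single review, one book without reviews).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of contained-event selection; not an Esper class.
public class ContainedEventModel {

    // Stand-in for a <review> element.
    public static final class Review {
        public final int reviewId;
        public final String comment;

        public Review(int reviewId, String comment) {
            this.reviewId = reviewId;
            this.comment = comment;
        }
    }

    // Stand-in for a <book> element; the review list may be empty.
    public static final class Book {
        public final String bookId;
        public final String author;
        public final List<Review> reviews;

        public Book(String bookId, String author, List<Review> reviews) {
            this.bookId = bookId;
            this.author = author;
            this.reviews = reviews;
        }
    }

    // Models MediaOrder[books.book]: one output row per book.
    public static List<Book> selectBooks(List<Book> books) {
        return new ArrayList<>(books);
    }

    // Models MediaOrder[books.book][review]: one output row per review of each book.
    public static List<Review> selectReviews(List<Book> books) {
        List<Review> rows = new ArrayList<>();
        for (Book book : books) {
            rows.addAll(book.reviews); // the second selection is relative to its parent book
        }
        return rows;
    }

    // The books of the sample media order PO200901.
    public static List<Book> sampleBooks() {
        return Arrays.asList(
                new Book("B001", "Heinlein",
                        Collections.singletonList(new Review(1, "best book ever"))),
                new Book("B002", "Isaac Asimov", Collections.<Review>emptyList()));
    }

    public static void main(String[] args) {
        List<Book> books = sampleBooks();
        System.out.println(selectBooks(books).size());   // 2
        System.out.println(selectReviews(books).size()); // 1
    }
}
```

In this model a book without reviews contributes no rows to the staggered selection, and the review rows carry no book or order fields, matching the description above.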

The following is not valid:

// not valid
select * from MediaOrder[books.book.review]

The book property is an indexed property (an array or collection) and thereby requires an index in order to determine which book to use. The expression books.book[1].review is valid and means all reviews of the second (index 1) book.

The contained-event selection syntax is part of the filter expression and may therefore occur in patterns and anywhere a filter expression is valid.

A pattern example is below. The example assumes that a Cancel event type has been defined that also has an orderId property:

select * from pattern [c=Cancel -> books=MediaOrder(orderId = c.orderId)[books.book] ]

When used in a pattern, a filter with a contained-event selection returns an array of events, similar to the match-until clause in patterns. The above statement returns, in the books property, an array of book events.

The optional select clause provides control over which fields are available in output events. The expressions in the select-clause apply only to the properties available underneath the property in the from clause, and the properties of the enclosing event.

When no select is specified, only the properties underneath the selected property are available in output events.

In summary, the select clause may contain:

The next query's select clause selects each review for each book, and the order id as well as the book id of each book:

select * from MediaOrder[select orderId, bookId from books.book][select * from review]
// ... equivalent to ...
select * from MediaOrder[select orderId, bookId from books.book][review]

Listeners to the statement above receive an event for each review of each book. Each output event has all properties of the review row, and in addition the bookId of each book and the orderId of the order. Thus bookId and orderId are found in each result event, duplicated when there are multiple reviews per book and order.

The above query uses wildcard (*) to select all properties from reviews. As has been discussed as part of the select clause, the wildcard (*) and property_alias.* do not copy properties for performance reasons. The wildcard syntax instead specifies the underlying type, and additional properties are added onto that underlying type if required. Only one wildcard (*) and property_alias.* (unless used with a column rename) may therefore occur in the select clause list of expressions.

All the following queries produce an output event for each review of each book. The next sample queries illustrate the options available to control the fields of output events.

The output events produced by the next query have all properties of each review and no other properties available:

select * from MediaOrder[books.book][review]

The following query is not a valid query, since the order id and book id are not part of the contained-event selection:

// Invalid select-clause: orderId and bookId not produced.
select orderId, bookId from MediaOrder[books.book][review]

This query is valid. Note that output events carry only the orderId and bookId properties and no other data:

select orderId, bookId from MediaOrder[books.book][select orderId, bookId from review]
//... equivalent to ...
select * from MediaOrder[select orderId, bookId from books.book][review]

This variation produces output events that have all properties of each book and only reviewId and comment for each review:

select * from MediaOrder[select * from books.book][select reviewId, comment from review]
// ... equivalent to ...
select * from MediaOrder[books.book as book][select book.*, reviewId, comment from review]

The output events of the next EPL have all properties of the order and only bookId and reviewId for each review:

select * from MediaOrder[books.book as book]
    [select mediaOrder.*, bookId, reviewId from review] as mediaOrder

This EPL produces output events with 3 columns: a column named mediaOrder that is the order itself, a column named book for each book and a column named review that holds each review:

insert into ReviewStream select * from MediaOrder[books.book as book]
  [select mo.* as mediaOrder, book.* as book, review.* as review from review as review] as mo
// .. and a sample consumer of ReviewStream...
select mediaOrder.orderId, book.bookId, review.reviewId from ReviewStream

Please note these limitations:

This section discusses contained-event selection in joins.

When joining within the same event it is not required that views are specified. Recall, in a join or outer join there must be views specified that hold the data to be joined. For self-joins, no views are required and the join executes against the data returned by the same event.

This query inner-joins items to books where book id matches the product id:

select book.bookId, item.itemId 
from MediaOrder[books.book] as book, 
      MediaOrder[items.item] as item 
where productId = bookId

Query results for the above query when sending the media order event as shown earlier are:

book.bookId    item.itemId
B001           100001

The next example query is a left outer join. It returns all books and their items; for a book without a matching item it returns the book and a null value:

select book.bookId, item.itemId 
from MediaOrder[books.book] as book 
  left outer join 
    MediaOrder[items.item] as item 
  on productId = bookId

Query results for the above query when sending the media order event as shown earlier are:

book.bookId    item.itemId
B001           100001
B002           null

A full outer join combines the results of both left and right outer joins. The joined table will contain all records from both tables, and fill in null values for missing matches on either side.

This example query is a full outer join, returning all books as well as all items, and filling in null values for book id or item id if no match is found:

select orderId, book.bookId, item.itemId 
from MediaOrder[books.book] as book 
  full outer join 
     MediaOrder[select orderId, * from items.item] as item 
  on productId = bookId 
order by bookId, item.itemId asc

As in all other continuous queries, aggregation results are cumulative from the time the statement was created.

The following query counts the cumulative number of items in which the product id matches a book id:

select count(*) 
from MediaOrder[books.book] as book, 
      MediaOrder[items.item] as item 
where productId = bookId

The unidirectional keyword in a join indicates to the query engine that aggregation state is not cumulative. The next query counts the number of items in which the product id matches a book id for each event:

select count(*) 
from MediaOrder[books.book] as book unidirectional, 
      MediaOrder[items.item] as item 
where productId = bookId
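The difference between the cumulative and the unidirectional count can be sketched in plain Java. This is not Esper API: the Order class, its two id lists and the sample values are hypothetical, and the sketch assumes, as a simplification, one count per arriving event.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JoinCountSketch {

    // Hypothetical, simplified media order: only the ids the join compares.
    public static final class Order {
        public final List<String> bookIds;
        public final List<String> itemProductIds;

        public Order(List<String> bookIds, List<String> itemProductIds) {
            this.bookIds = bookIds;
            this.itemProductIds = itemProductIds;
        }
    }

    // Rows of the self-join for one event: pairs where productId = bookId.
    public static int joinRows(Order order) {
        int rows = 0;
        for (String bookId : order.bookIds) {
            for (String productId : order.itemProductIds) {
                if (productId.equals(bookId)) {
                    rows++;
                }
            }
        }
        return rows;
    }

    // Without unidirectional: count(*) keeps accumulating across events.
    public static List<Integer> cumulativeCounts(List<Order> orders) {
        List<Integer> output = new ArrayList<>();
        int total = 0;
        for (Order order : orders) {
            total += joinRows(order);
            output.add(total);
        }
        return output;
    }

    // With unidirectional: count(*) covers only the rows of the current event.
    public static List<Integer> perEventCounts(List<Order> orders) {
        List<Integer> output = new ArrayList<>();
        for (Order order : orders) {
            output.add(joinRows(order));
        }
        return output;
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
            new Order(Arrays.asList("B001", "B002"), Arrays.asList("B001")),
            new Order(Arrays.asList("B005"), Arrays.asList("B005", "B005")));
        System.out.println("cumulative: " + cumulativeCounts(orders));
        System.out.println("per event:  " + perEventCounts(orders));
    }
}
```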

The update istream statement allows declarative modification of event properties of events entering a stream. Update is a pre-processing step to each new event, modifying an event before the event applies to any statements.

The synopsis of update istream is as follows:

update istream event_type [as stream_name]
  set property_name = set_expression [, property_name = set_expression] [,...]
  [where where_expression]

The event_type is the name of the type of events that the update applies to. The optional as keyword can be used to assign a name to the event type for use with subqueries, for example. Following the set keyword is a comma-separated list of property names and expressions that provide the event properties to change and values to set.

The optional where clause and expression can be used to filter out events to which to apply updates.

Listeners to an update statement receive the updated event in the insert stream (new data) and the event prior to the update in the remove stream (old data). Note that if there are multiple update statements that all apply to the same event then the engine will ensure that the output events delivered to listeners or subscribers are consistent with the then-current updated properties of the event (if necessary making event copies, as described below, in the case that listeners are attached to update statements). Iterating over an update statement returns no events.

As an example, the below statement assumes an AlertEvent event type that has properties named severity and reason:

update istream AlertEvent 
  set severity = 'High'
  where severity = 'Medium' and reason like '%withdrawal limit%'

The statement above changes the value of the severity property to "High" for AlertEvent events that have a medium severity and contain a specific reason text.

Update statements apply the changes to event properties before other statements receive the event(s) for processing, e.g. "select * from AlertEvent" receives the updated AlertEvent. This is true regardless of the order in which your application creates statements.

When multiple update statements apply to the same event, the engine executes updates in the order in which update statements are created. We recommend the @Priority EPL annotation to define a deterministic order of processing updates, especially in the case where update statements get created and destroyed dynamically or multiple update statements update the same fields. The update statement with the highest @Priority value applies last.
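The ordering rule can be illustrated with a small plain-Java model. It is a sketch, not Esper API: the Update class is hypothetical and models an update statement that unconditionally sets severity, and the sketch assumes that statements with equal priority apply in creation order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class UpdatePrioritySketch {

    // Hypothetical model of one update statement that sets severity.
    public static final class Update {
        public final int priority;     // value of the @Priority annotation
        public final String severity;  // value assigned by the set clause

        public Update(int priority, String severity) {
            this.priority = priority;
            this.severity = severity;
        }
    }

    // Applies updates lowest priority first, so the highest priority applies
    // last and its value is the one later statements observe.
    public static String applyUpdates(String initialSeverity, List<Update> inCreationOrder) {
        List<Update> ordered = new ArrayList<>(inCreationOrder);
        // List.sort is stable: equal priorities keep their creation order.
        ordered.sort(Comparator.comparingInt((Update u) -> u.priority));
        String severity = initialSeverity;
        for (Update update : ordered) {
            severity = update.severity;
        }
        return severity;
    }

    public static void main(String[] args) {
        List<Update> updates = Arrays.asList(new Update(5, "High"), new Update(1, "Low"));
        System.out.println(applyUpdates("Medium", updates));
    }
}
```

In this model the statement with priority 5 wins even though it was created first, because it applies after the statement with priority 1.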

The update clause can be used on streams populated via insert into, as this example utilizing a pattern demonstrates:

insert into DoubleWithdrawalStream 
select a.id, b.id, a.account as account, 0 as minimum 
from pattern [a=Withdrawal -> b=Withdrawal(id = a.id)]
update istream DoubleWithdrawalStream set minimum = 1000 where account in (10002, 10003)

When using update istream with named windows, any changes to event properties apply before an event enters the named window. The update istream is not available for tables.

Consider the next example (shown here with statement names in @Name EPL annotation, multiple EPL statements):

@Name("CreateWindow") create window MyWindow.win:time(30 sec) as AlertEvent

@Name("UpdateStream") update istream MyWindow set severity = 'Low' where reason like '%out of paper%'

@Name("InsertWindow") insert into MyWindow select * from AlertEvent

@Name("SelectWindow") select * from MyWindow

The UpdateStream statement specifies an update clause that applies to all events entering the named window. Note that update does not apply to events already in the named window at the time an application creates the UpdateStream statement, it only applies to new events entering the named window (after an application created the update statement).

Therefore, in the above example listeners to the SelectWindow statement as well as the CreateWindow statement receive the updated event, while listeners to the InsertWindow statement receive the original AlertEvent event (and not the updated event).

Subqueries can also be used in all expressions including the optional where clause.

This example demonstrates a correlated subquery in an assignment expression and also demonstrates the optional as keyword. It assigns the phone property of an AlertEvent event a new value based on the lookup within all unique PhoneEvent events (according to an empid property) correlating the AlertEvent property reporter with the empid property of PhoneEvent:

update istream AlertEvent as ae
  set phone = 
    (select phone from PhoneEvent.std:unique(empid) where empid = ae.reporter)

When updating indexed properties use the syntax propertyName[index] = value with the index value being an integer number. When updating mapped properties use the syntax propertyName(key) = value with the key being a string value.

When using update, please note these limitations:

The engine delivers all result events of a given statement to the statement's listeners and subscriber (if any) in a single invocation of each listener and subscriber's update method passing an array of result events. For example, a statement using a time-batch view may provide many result events after a time period passes, a pattern may provide multiple matching events or in a join the join cardinality could be multiple rows.

For statements that typically post multiple result events to listeners the for keyword controls the number of invocations of the engine to listeners and subscribers and the subset of all result events delivered by each invocation. This can be useful when your application listener or subscriber code expects multiple invocations or expects that invocations only receive events that belong together by some additional criteria.

The for keyword is a reserved keyword. It is followed by either the grouped_delivery keyword for grouped delivery or the discrete_delivery keyword for discrete delivery. The for clause is valid after any EPL select statement.

The synopsis for grouped delivery is as follows:

... for grouped_delivery (group_expression [, group_expression] [,...])

The group_expression expression list provides one or more expressions to apply to result events. The engine invokes listeners and subscribers once for each distinct set of values returned by group_expression expressions passing only the events for that group.

The synopsis for discrete delivery is as follows:

... for discrete_delivery

With discrete delivery the engine invokes listeners and subscribers once for each result event passing a single result event in each invocation.

Consider the following example without for-clause. The time batch data view collects RFIDEvent events for 10 seconds and posts an array of result events:

select * from RFIDEvent.win:time_batch(10 sec)

Let's consider an example event sequence as follows:

RFIDEvent(id:1, zone:'A')
RFIDEvent(id:2, zone:'B')
RFIDEvent(id:3, zone:'A')

Without for-clause and after the 10-second time period passes, the engine delivers an array of 3 events in a single invocation to listeners and the subscriber.

The next example specifies the for-clause and grouped delivery by zone:

select * from RFIDEvent.win:time_batch(10 sec) for grouped_delivery (zone)

With grouped delivery and after the 10-second time period passes, the above statement delivers result events in two invocations to listeners and the subscriber: The first invocation delivers an array of two events that contains zone A events with id 1 and 3. The second invocation delivers an array of 1 event that contains a zone B event with id 2.

The next example specifies the for-clause and discrete delivery:

select * from RFIDEvent.win:time_batch(10 sec) for discrete_delivery

With discrete delivery and after the 10-second time period passes, the above statement delivers result events in three invocations to listeners and the subscriber: The first invocation delivers an array of 1 event that contains the event with id 1, the second invocation delivers an array of 1 event that contains the event with id 2 and the third invocation delivers an array of 1 event that contains the event with id 3.
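The two delivery modes can be modelled in plain Java as a way of splitting one batch of result events into listener invocations. This is a sketch only, not Esper API: the RfidEvent class and both method names are hypothetical, and the batch is the three-event sequence described above (ids 1 and 3 in zone A, id 2 in zone B).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DeliverySketch {

    // Hypothetical stand-in for an RFIDEvent with id and zone properties.
    public static final class RfidEvent {
        public final int id;
        public final String zone;

        public RfidEvent(int id, String zone) {
            this.id = id;
            this.zone = zone;
        }
    }

    // Models "for grouped_delivery (zone)": one invocation per distinct zone,
    // each receiving only the events of that zone, in batch order.
    public static List<List<RfidEvent>> groupedDelivery(List<RfidEvent> batch) {
        Map<String, List<RfidEvent>> byZone = new LinkedHashMap<>();
        for (RfidEvent event : batch) {
            byZone.computeIfAbsent(event.zone, zone -> new ArrayList<>()).add(event);
        }
        return new ArrayList<>(byZone.values());
    }

    // Models "for discrete_delivery": one invocation per result event.
    public static List<List<RfidEvent>> discreteDelivery(List<RfidEvent> batch) {
        List<List<RfidEvent>> invocations = new ArrayList<>();
        for (RfidEvent event : batch) {
            invocations.add(Collections.singletonList(event));
        }
        return invocations;
    }

    public static void main(String[] args) {
        List<RfidEvent> batch = Arrays.asList(
            new RfidEvent(1, "A"), new RfidEvent(2, "B"), new RfidEvent(3, "A"));
        System.out.println("grouped invocations:  " + groupedDelivery(batch).size());
        System.out.println("discrete invocations: " + discreteDelivery(batch).size());
    }
}
```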

Remove stream events are also delivered in multiple invocations, one for each group, if your statement selects remove stream events explicitly via irstream or rstream keywords.

The insert into for inserting events into a stream is not affected by the for-clause.

The delivery order respects the natural sort order or the explicit sort order as provided by the order by clause, if present.

The following are known limitations:

  1. The engine validates group_expression expressions against the output event type, therefore all properties specified in group_expression expressions must occur in the select clause.

A named window is a globally-visible data window. A table is a globally-visible data structure organized by primary key or keys.

Named windows and tables both offer a way to share state between statements. Named windows and tables have differing capabilities and semantics.

To query a named window or table, simply use the named window name or table name in the from clause of your statement, including statements that contain subqueries, joins and outer-joins.

Certain clauses operate on either a named window or a table, namely the on-merge, on-update, on-delete and on-select clauses. The fire-and-forget queries also operate on both named windows and tables.

Both named windows and tables can have columns that hold events as column values, as further described in Section 6.12, “Events As Property”.

As a general rule-of-thumb, if you need to share a data window between statements, the named window is the right approach. If however rows are organized by primary key or hold aggregation state, a table may be preferable. EPL statements allow the combined use of both.

The create window statement creates a named window by specifying a window name and one or more data window views, as well as the type of event to hold in the named window.

There are two syntaxes for creating a named window: The first syntax allows modeling a named window after an existing event type or an existing named window. The second syntax is similar to the SQL create-table syntax and provides a list of column names and column types.

A new named window starts up empty. It must be explicitly inserted into by one or more statements, as discussed below. A named window can also be populated at time of creation from an existing named window.

If your application stops or destroys the statement that creates the named window, any consuming statements no longer receive insert or remove stream events. In addition, events can no longer be deleted from the named window after that statement was stopped or destroyed.

The create window statement posts to listeners any events that are inserted into the named window as new data. The statement posts all deleted events or events that expire out of the data window to listeners as the remove stream (old data). The named window contents can also be iterated on via the pull API to obtain the current contents of a named window.

The benefit of modeling a named window after an existing event type is that event properties can be nested, indexed, mapped or other types that your event objects may provide as properties, including the type of the underlying event itself. Also, using the wildcard (*) operator means your EPL does not need to list each individual property explicitly.

The syntax for creating a named window by modeling the named window after an existing event type, is as follows:

[context context_name] 
create window window_name.view_specifications 
  [as] [select list_of_properties from] event_type_or_windowname
  [insert [where filter_expression]]

The window_name you assign to the named window can be any identifier. The name should not already be in use as an event type or stream name or table name.

The view_specifications are one or more data window views that define the expiry policy for removing events from the data window. Named windows must explicitly declare a data window view. This is required to ensure that the policy for retaining events in the data window is well defined. To keep all events, use the keep-all view: it indicates that the named window should keep all events and only remove events that are deleted via the on delete clause. The view specification can only list data window views; derived-value views are not allowed since they do not represent an expiry policy. Data window views are listed in Chapter 13, EPL Reference: Views. View parameterization and staggering are described in Section 5.4.3, “Specifying Views”.

The select clause and list_of_properties are optional. If present, they specify the column names and, implicitly by definition of the event type, the column types of events held by the named window. Expressions other than column names are not allowed in the select list of properties. Wildcards (*) and wildcards with additional properties can also be used.

The event_type_or_windowname is required if using the model-after syntax. It provides the name of the event type of events held in the data window, unless column names and types have been explicitly selected via select. The name of an (existing) other named window is also allowed here. Please find more details in Section 6.2.1.4, “Populating a Named Window from an Existing Named Window”.

Finally, the insert clause and optional filter_expression are used if the new named window is modelled after an existing named window, and the data of the new named window is to be populated from the existing named window upon creation. The optional filter_expression can be used to exclude events.

You may refer to a context by specifying the context keyword followed by a context name. Contexts are described in more detail at Chapter 4, Context and Context Partitions. The effect of referring to a context is that your named window operates according to the context dimensional information as declared for the context. For usage and limitations please see the respective chapter.

The next statement creates a named window OrdersNamedWindow for which the expiry policy is simply to keep all events. Assume that the event type 'OrderMapEventType' has been configured. The named window is to hold events of type 'OrderMapEventType':

create window OrdersNamedWindow.win:keepall() as OrderMapEventType

The below sample statement demonstrates the select syntax. It defines a named window in which each row has the three properties 'symbol', 'volume' and 'price'. This named window actively removes events from the window that are older than 30 seconds.

create window OrdersTimeWindow.win:time(30 sec) as 
  select symbol, volume, price from OrderEvent

In an alternate form, the as keyword can be used to rename columns, and constants may occur in the select-clause as well:

create window OrdersTimeWindow.win:time(30 sec) as 
  select symbol as sym, volume as vol, price, 1 as alertId from OrderEvent

The second syntax for creating a named window is by supplying column names and types:

[context context_name] 
create window window_name.view_specifications [as] (column_name column_type 
  [,column_name column_type [,...]])

The column_name is an identifier providing the event property name. The column_type is also required for each column. Valid column types are listed in Section 5.17.1, “Creating Variables: the Create Variable clause” and are the same as for variable types.

For attributes that are array-type append [] (left and right brackets).

The next statement creates a named window:

create window SecurityEvent.win:time(30 sec) 
(ipAddress string, userId String, numAttempts int, properties String[])

Named window columns can hold events by declaring the column type as the event type name. Array-type in combination with event-type is also supported.

The next two statements declare an event type and create a named window with a column of the defined event type:

create schema SecurityData (name String, roles String[])
create window SecurityEvent.win:time(30 sec) 
    (ipAddress string, userId String, secData SecurityData, historySecData SecurityData[])

Whether the named window uses a Map or Object-array event representation for the rows can be specified as follows. If the create-window statement provides the @EventRepresentation(array=true) annotation the engine maintains named window rows as object array. If the statement provides the @EventRepresentation(array=false) annotation the engine maintains named window rows using Map objects. If neither annotation is provided, the engine uses the configured default event representation as discussed in Section 16.4.11.1, “Default Event Representation”.

The following EPL statement instructs the engine to represent FooWindow rows as object arrays:

@EventRepresentation(array=true) create window FooWindow.win:time(5 sec) as (prop1 string)

The insert into clause inserts events into named windows. Your application must ensure that the column names and types match the declared column names and types of the named window to be inserted into.

For inserting into a named window and for simultaneously checking if the inserted row already exists in the named window or for atomic update-insert operation on a named window, consider using on-merge as described in Section 6.8, “Triggered Upsert using the On-Merge Clause”. On-merge is similar to the SQL merge clause and provides what is known as an "Upsert" operation: Update existing events or if no existing event(s) are found then insert a new event, all in one atomic operation provided by a single EPL statement.

In this example we first create a named window using some of the columns of an OrderEvent event type:

create window OrdersWindow.win:keepall() as select symbol, volume, price from OrderEvent

The insert into the named window selects individual columns to be inserted:

insert into OrdersWindow(symbol, volume, price) select name, vol, price from FXOrderEvent

An alternative form is shown next:

insert into OrdersWindow select name as symbol, vol as volume, price from FXOrderEvent

With the above statement, the engine inserts every FXOrderEvent that arrives at the engine into the named window 'OrdersWindow'.

The following EPL statements create a named window for an event type backed by a Java class and insert into the window any 'OrderEvent' where the symbol value is IBM:

create window OrdersWindow.win:time(30) as com.mycompany.OrderEvent
insert into OrdersWindow select * from com.mycompany.OrderEvent(symbol='IBM')

The last example adds one column named 'derivedPrice' to the 'OrderEvent' type by specifying a wildcard, and uses a user-defined function to populate the column:

create window OrdersWindow.win:time(30) as select *, price as derivedPrice from OrderEvent
insert into OrdersWindow select *, MyFunc.func(price, percent) as derivedPrice from OrderEvent

Event representations based on Java base classes or interfaces, and subclasses or implementing classes, are compatible as these statements show:

// create a named window for the base class
create window OrdersWindow.std:unique(name) as select * from ProductBaseEvent
// The ServiceProductEvent class subclasses the ProductBaseEvent class
insert into OrdersWindow select * from ServiceProductEvent
// The MerchandiseProductEvent class subclasses the ProductBaseEvent class
insert into OrdersWindow select * from MerchandiseProductEvent

To avoid duplicate events inserted in a named window and atomically check if a row already exists, use on-merge as outlined in Section 6.8, “Triggered Upsert using the On-Merge Clause”. An example:

on ServiceProductEvent as spe merge OrdersWindow as win
where win.id = spe.id when not matched then insert select *
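The effect of "when not matched then insert" resembles an insert-if-absent operation. The plain-Java sketch below is an analogy under stated assumptions, not Esper API: it models the window as a map keyed by id (a named window itself is not keyed), and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MergeInsertSketch {

    // Rows keyed by id; a simplification of the where-clause win.id = spe.id.
    private final Map<String, String> rowsById = new LinkedHashMap<>();

    // Inserts the row only when no row with the same id exists.
    // Returns true if the row was inserted, false if a match was found.
    public boolean mergeInsert(String id, String row) {
        return rowsById.putIfAbsent(id, row) == null;
    }

    public String get(String id) {
        return rowsById.get(id);
    }

    public int size() {
        return rowsById.size();
    }

    public static void main(String[] args) {
        MergeInsertSketch window = new MergeInsertSketch();
        System.out.println(window.mergeInsert("S1", "first arrival"));
        System.out.println(window.mergeInsert("S1", "duplicate arrival"));
        System.out.println(window.size());
    }
}
```

The check and the insert happen in one call here, which is the property the on-merge clause provides within a single EPL statement.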

Decorated events hold an underlying event and add additional properties to the underlying event, as described further in Section 5.10.4, “Decorated Events”.

Here we create a named window that decorates OrderEvent events by adding an additional property named priceTotal to each OrderEvent. A matching insert into statement is also part of the sample:

create window OrdersWindow.win:time(30) as select *, price as priceTotal from OrderEvent
insert into OrdersWindow select *, price * unit as priceTotal from ServiceOrderEvent

The property type of the additional priceTotal column is the property type of the existing price property of OrderEvent.

A named window can be referred to by any statement in the from clause of the statement. Filter criteria can also be specified. Additional views may be used onto named windows however such views cannot include data window views.

A statement selecting all events from a named window OrdersNamedWindow is shown next. The named window must first be created via the create window clause before use.

select * from OrdersNamedWindow

The statement as above simply receives the unfiltered insert stream of the named window and reports that stream to its listeners. The iterator method returns all events in the named window, if any.

If your application desires to obtain the events removed from the named window, use the rstream keyword as this statement shows:

select rstream * from OrdersNamedWindow

The next statement derives an average price per symbol for the events held by the named window:

select symbol, avg(price) from OrdersNamedWindow group by symbol

A statement that consumes from a named window, like the one above, receives the insert and remove stream of the named window. The insert stream represents the events inserted into the named window. The remove stream represents the events that expired from the named window's data window and the events explicitly deleted via on-delete or via on-demand (fire-and-forget) delete.

Your application may create a consuming statement such as above on an empty named window, or your application may create the above statement on an already filled named window. The engine provides correct results in either case: At the time of statement creation the Esper engine internally initializes the consuming statement from the current named window, also taking your declared filters into consideration. Thus, your statement deriving data from a named window does not start empty if the named window already holds one or more events. A consuming statement also sees the remove stream of an already populated named window, if any.

If you require a subset of the data in the named window, you can specify one or more filter expressions onto the named window as shown here:

select symbol, avg(price) from OrdersNamedWindow(sector='energy') group by symbol

By adding a filter to the named window, the aggregation and grouping as well as any views that may be declared onto the named window receive a filtered insert and remove stream. The above statement thus outputs, continuously, the average price per symbol for all orders in the named window that belong to a certain sector.

A side note on variables in filters that filter events from named windows: The engine initializes consuming statements at statement creation time and changes aggregation state continuously as events arrive. If the filter criteria contain variables and variable values change, then the engine does not re-evaluate or re-build aggregation state. In such a case you may want to place variables in the having clause, which evaluates on already-built aggregation state.

The following example further declares a view into the named window. Such a view can be a plug-in view or one of the built-in views, but cannot be a data window view (with the exception of the std:groupwin grouped-window view which is allowed).

select * from OrdersNamedWindow(volume>0, price>0).mycompany:mypluginview()

Data window views cannot be used onto named windows since named windows post insert and remove streams for the events entering and leaving the named window, thus the expiry policy and batch behavior are well defined by the data window declared for the named window. For example, the following is not allowed and fails at time of statement creation:

// not a valid statement
select * from OrdersNamedWindow.win:time(30 sec)

The create table statement creates a table.

A new table starts up empty. It must be explicitly aggregated-into using into table, or populated by an on-merge statement, or populated by insert into.

The syntax for creating a table provides the table name, lists column names and types and designates primary key columns:

[context context_name] 
create table table_name [as] (column_name column_type [primary key]
  [,column_name column_type [primary key] [,...]])

The table_name you assign to the table can be any identifier. The name should not already be in use as an event type or named window name.

You may refer to a context by specifying the context keyword followed by a context name. Contexts are described in more detail at Chapter 4, Context and Context Partitions. The effect of referring to a context is that your table operates according to the context dimensional information as declared for the context. For usage and limitations please see the respective chapter.

The column_name is an identifier providing the column name.

The column_type is required for each column. There are two categories of column types:

  1. Non-aggregating column types: Valid column types are listed in Section 5.17.1, “Creating Variables: the Create Variable clause” and are the same as for variable types. For attributes that are array-type append [] (left and right brackets). Table columns can hold events by declaring the column type as the event type name. Array-type in combination with event-type is also supported.

  2. Aggregation column types: These instruct the engine to retain aggregation state.

After each column type you may add the primary key keywords. This keyword designates the column as a primary key. When multiple columns are designated as primary key columns the combination of column values builds a compound primary key. The order in which the primary key columns are listed is important.

The next statement creates a table to hold a numAttempts count aggregation state and a column named active of type boolean, per ipAddress and userId:

create table SecuritySummaryTable (
  ipAddress string primary key,
  userId String primary key, 
  numAttempts count(*),
  active boolean)

The example above specifies ipAddress and userId as primary keys. This instructs the engine that the table holds a single row for each distinct combination of ipAddress and userId. The two values make up the compound key and there is a single row per compound key value.

If you do not designate any columns of the table as a primary key column, the table holds only one row (or no rows).

The create table statement does not provide output to its listeners. The table contents can be iterated on via the pull API to obtain the current contents of a table.

All aggregation functions can be used as column types for tables. List the aggregation function name as the column type and provide type information when required. See Section 10.2.1, “SQL-Standard Functions” for a list of the functions and required parameter expressions for which you must provide type information.

Consider the next example that declares a table with columns for different aggregation functions (not a comprehensive example of all possible aggregation functions):

create table MyStats (
  myKey string primary key,
  myAvedev avedev(int), // column holds a mean deviation of int-typed values
  myAvg avg(double), // column holds an average of double-typed values
  myCount count(*), // column holds a count
  myMax max(int), // column holds a highest int-typed value
  myMedian median(float), // column holds the median of float-typed values
  myStddev stddev(java.math.BigDecimal), // column holds a standard deviation of BigDecimal values
  mySum sum(long), // column holds a sum of long values
  myFirstEver firstever(string), // column holds a first-ever value of type string
  myCountEver countever(*) // column holds the count-ever (regardless of data windows)
)

Additional keywords such as distinct can be used as well. If your aggregation will be associated with a filter expression, you must add boolean to the parameters in the column type declaration.

For example, the next EPL declares a table with aggregation-type columns that hold an average of filtered double-typed values and an average of distinct double-typed values:

create table MyStatsMore (
  myKey string primary key,
  myAvgFiltered avg(double, boolean), // column holds an average of double-typed values
                      // and filtered by a boolean expression to be provided
  myAvgDistinct avg(distinct double) // column holds an average of distinct double-typed values
)
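A statement that aggregates into these columns might look like the following sketch. It assumes a hypothetical PriceEvent event type with a string-type myKey property and a double-type price property. The filter expression is passed as the second parameter of avg, matching the (double, boolean) declaration:

```epl
// PriceEvent, myKey and price are assumed names, not part of the table declaration
into table MyStatsMore
select avg(price, price > 0) as myAvgFiltered,  // only positive prices enter the average
       avg(distinct price) as myAvgDistinct     // each distinct price counted once
from PriceEvent
group by myKey
```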

The event aggregation functions can be used as column types for tables. For event aggregation functions you must specify the event type using the @type(name) annotation.

The window event aggregation function requires the * wildcard. The first and last functions cannot be used in a declaration; use window instead and access the values as described in Section 6.3.3.2, “Accessing Aggregation State With The Dot Operator”.

The sorted, maxbyever and minbyever event aggregation functions require the criteria expression as a parameter. The criteria expression must only use properties of the provided event type. The maxby and minby functions cannot be used in a declaration; use sorted instead and access the values as described in Section 6.3.3.2, “Accessing Aggregation State With The Dot Operator”.

In this example the table declares sample event aggregations (not a comprehensive example of all possible aggregations):

create table MyEventAggregationTable (
  myKey string primary key,
  myWindow window(*) @type(MyEvent), // column holds a window of MyEvent events
  mySorted sorted(mySortValue) @type(MyEvent), // column holds MyEvent events sorted by mySortValue
  myMaxByEver maxbyever(mySortValue) @type(MyEvent) // column holds the single MyEvent event that 
        // provided the highest value of mySortValue ever
)
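To populate such an event aggregation column, a statement can aggregate into it with the same function that the column declares. The sketch below assumes that MyEvent has a string-type property named myKey and keeps the last 30 seconds of events per key; the data window and its length are illustrative choices:

```epl
// assumes MyEvent provides a string-type myKey property
into table MyEventAggregationTable
select window(*) as myWindow
from MyEvent.win:time(30 sec)
group by myKey
```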

Use the into table keywords to instruct the engine to aggregate into table columns. A given statement can only aggregate into a single table.

For example, consider a table that holds the count of intrusion events keyed by the combination of from-address and to-address:

create table IntrusionCountTable (
  fromAddress string primary key,
  toAddress string primary key,
  countIntrusion10Sec count(*),
  countIntrusion60Sec count(*)
)

The next sample statement updates the count considering the last 10 seconds of events:

into table IntrusionCountTable
select count(*) as countIntrusion10Sec
from IntrusionEvent.win:time(10)
group by fromAddress, toAddress

Multiple statements can aggregate into the same table columns or different table columns. The co-aggregating ability allows you to co-locate aggregation state conveniently.

The sample shown below is very similar to the previous statement except that it updates the count considering the last 60 seconds of events:

into table IntrusionCountTable
select count(*) as countIntrusion60Sec
from IntrusionEvent.win:time(60)
group by fromAddress, toAddress

Considering the example above, when an intrusion event arrives and a row for the group-by key values (from and to-address) does not exist, the engine creates a new row and updates the aggregation-type columns. If the row for the group-by key values exists, the engine updates the aggregation-type columns of the existing row.

Tables can have no primary key columns. In this case a table either has a single row or is empty.

The next two EPL statements demonstrate table use without a primary key column:

create table TotalIntrusionCountTable (totalIntrusions count(*))

into table TotalIntrusionCountTable select count(*) as totalIntrusions from IntrusionEvent

In conjunction with into table the unidirectional keyword is not supported.

For accessing table columns by primary key, EPL provides a convenient syntax that allows you to read table column values simply by providing the table name, primary key value expressions (if required by the table) and the column name.

The synopsis for table-column access expressions is:

table-name[primary_key_expr [, primary_key_expr] [,...]][.column-name]

The expression starts with the table name. If the table declares primary keys you must provide the primary_key_expr value expressions for each primary key within square brackets. To access a specific column, add the (.) dot character and the column name.

For example, consider a table that holds the count of intrusion events keyed by the combination of from-address and to-address:

create table IntrusionCountTable (
  fromAddress string primary key,
  toAddress string primary key,
  countIntrusion10Sec count(*)
)

Assuming that a FirewallEvent has string-type properties named from and to, the next EPL statement outputs the current 10-second intrusion count as held by the IntrusionCountTable row for the matching combination of keys:

select IntrusionCountTable[from, to].countIntrusion10Sec from FirewallEvent

The number of primary key expressions, the return type of the primary key expressions and the order in which they are provided must match the primary key columns that were declared for the table. If the table does not have any primary keys declared, you cannot provide any primary key expressions.

If a row for the primary key (or compound key) cannot be found, the engine returns a null value.
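Because a missing row yields null, a statement can test for the presence of a row with is null. The following is a minimal sketch that assumes FirewallEvent carries string-type fromAddress and toAddress properties; the noRowYet output name is hypothetical:

```epl
// noRowYet is true when the table holds no row for the key combination
select fromAddress, toAddress,
  IntrusionCountTable[fromAddress, toAddress].countIntrusion10Sec is null as noRowYet
from FirewallEvent
```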

An example table without primary key columns is shown next:

create table TotalIntrusionCountTable (totalIntrusions count(*))

A sample statement that outputs the current total count every 60 seconds is:

select TotalIntrusionCountTable.totalIntrusions from pattern[every timer:interval(60 sec)]

Table access expressions can be used anywhere in statements except as parameter expressions for data windows, the update istream, context declarations, output limit expressions, pattern observer and guard parameters, pattern every-distinct, pattern match-until bounds, pattern followed-by max and create window insert or select expression and as a create variable assignment expression.

The insert into clause inserts rows into a table. Your application must ensure that the column names and types match the declared column names and types of the table to be inserted into, when provided.

For inserting into a table and for simultaneously checking if the inserted row already exists in the table or for atomic update-insert operation on a table, consider using on-merge as described in Section 6.8, “Triggered Upsert using the On-Merge Clause”. On-merge is similar to the SQL merge clause and provides what is known as an "Upsert" operation: Update existing rows or if no existing row(s) are found then insert a new row, all in one atomic operation provided by a single EPL statement.

The following statement populates the example table declared earlier:

insert into IntrusionCountTable select fromAddress, toAddress from FirewallEvent

Note that when a row with the same primary key values already exists, your statement may encounter a unique index violation at runtime. If the inserted-into table does not have primary key columns, the table holds a maximum of one row and your statement may also encounter a unique index violation upon attempting to insert a second row. Use on-merge to prevent inserts of duplicate rows.
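One way to avoid the violation is sketched below. It uses on-merge to insert a row only when no row exists for the key combination, and assumes that FirewallEvent provides fromAddress and toAddress properties:

```epl
on FirewallEvent as fe
merge IntrusionCountTable as ict
where ict.fromAddress = fe.fromAddress and ict.toAddress = fe.toAddress
when not matched
  // no row for this key combination yet: insert the key columns only
  then insert select fe.fromAddress as fromAddress, fe.toAddress as toAddress
```

Since there is no when matched clause, an arriving event whose key combination is already present leaves the table unchanged.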

Table columns that are aggregation functions cannot be inserted-into and must be updated using into table instead.

You may also explicitly list column names as discussed earlier in Section 6.2.2, “Inserting Into Named Windows”. For insert-into, the context name must be the same context name as declared for the create table statement or the context name must be absent for both.
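For illustration, an insert with an explicit column list might look as follows. The ConnectionEvent event type and its src and dst properties are hypothetical and stand in for any source whose property names differ from the table column names:

```epl
// src maps to fromAddress and dst maps to toAddress, by position
insert into IntrusionCountTable(fromAddress, toAddress)
select src, dst from ConnectionEvent
```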

A table can be referred to by any statement in the from-clause of the statement.

Tables do not provide an insert and remove stream. When a table appears alone in the from-clause (other than as part of a subquery), the statement produces output only when iterated (see pull API) or when executing an on-demand (fire-and-forget) query.

Assuming you have declared a table by name IntrusionCountTable as shown earlier, the following statement only returns rows when iterated or when executing the EPL as an on-demand query or when adding an output snapshot:

select * from IntrusionCountTable
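A sketch of the output snapshot variant mentioned above follows. The 10-second interval is an arbitrary choice for illustration:

```epl
// outputs the current table contents every 10 seconds
select * from IntrusionCountTable output snapshot every 10 seconds
```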

For tables, the contained-event syntax and the declaration of views are not supported. In a join, a table in the from-clause cannot be marked as unidirectional. You may not specify any of the retain-flags. Tables cannot be used in the from-clause of match-recognize statements, in context declarations, in pattern filter atoms or in update istream statements.

The following are examples of invalid statements:

// invalid statement examples
select * from IntrusionCountTable.win:time(30 sec)   // views not allowed
select * from IntrusionCountTable unidirectional, MyEvent   // tables cannot be marked as unidirectional

Tables can be used in subqueries and joins.

A sample subselect and a sample join against the table follow:

select
  (select * from IntrusionCountTable as intr
   where intr.fromAddress = firewall.fromAddress and intr.toAddress = firewall.toAddress)
from IntrusionEvent as firewall

select * from IntrusionCountTable as intr, IntrusionEvent as firewall
where intr.fromAddress = firewall.fromAddress and intr.toAddress = firewall.toAddress

If the subselect or join specifies all of a table's primary key columns, please consider using the table-access expression instead. It offers a more concise syntax.
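As a sketch, a lookup that supplies both primary key columns of IntrusionCountTable could be written with a table-access expression as shown here. The count10Sec output name is an assumption:

```epl
// both primary key values are provided, so the table-access syntax applies
select IntrusionCountTable[firewall.fromAddress, firewall.toAddress].countIntrusion10Sec
  as count10Sec
from IntrusionEvent as firewall
```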

Note that for a subquery against a table that may return multiple rows, the information about subquery multi-row selection applies. For subselects, consider using @eventbean to preserve table type information in the output event.

Note that for joins against tables the engine does not allow specifying table filter expressions in parentheses in the from clause. Filter expressions must instead be placed into the where-clause.

You may access aggregation state the same way as in table-access expressions, using the dot (.) operator.

The EPL shown below declares a table that keeps a set of events, and shows a join that selects window aggregation state:

create table MyWindowTable (theWindow window(*) @type(MyEvent))

select theWindow.first(), theWindow.last(), theWindow.window() from MyEvent, MyWindowTable

The on select clause performs a one-time, non-continuous query on a named window or table every time a triggering event arrives or a triggering pattern matches. The query can consider all rows, or only rows that match certain criteria, or rows that correlate with an arriving event or a pattern of arriving events.

The syntax for the on select clause is as follows:

on event_type[(filter_criteria)] [as stream_name]
[insert into insert_into_def]
select select_list
from window_or_table_name [as stream_name]
[where criteria_expression]
[group by grouping_expression_list]
[having grouping_search_conditions]
[order by order_by_expression_list]

The event_type is the name of the type of events that trigger the query against the named window or table. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as keyword can be used to assign a stream name. Patterns or named windows can also be specified in the on clause; see the samples in Section 6.7.1, “Using Patterns in the On Delete Clause”. When a named window is the trigger, only its insert stream events trigger actions. Tables cannot be triggers.

The insert into clause works as described in Section 5.10, “Merging Streams and Continuous Insertion: the Insert Into Clause”. The select clause is described in Section 5.3, “Choosing Event Properties And Events: the Select Clause”. For all clauses the semantics are equivalent to a join operation: The properties of the triggering event or events are available in the select clause and all other clauses.

The window_or_table_name in the from clause is the name of the named window or table to select rows from. The as keyword is also available to assign a stream name to the table or named window. The as keyword is helpful in conjunction with wildcard in the select clause to select rows via the syntax select streamname.* .

The optional where clause contains a criteria_expression that correlates the arriving (triggering) event to the rows to be considered from the table or named window. The criteria_expression may also simply filter for rows to be considered by the query.

The group by clause, the having clause and the order by clause are all optional and work as described in earlier chapters.

Queries against tables and named windows work the same. The examples herein use the OrdersNamedWindow named window and the SecuritySummaryTable table to provide examples for each.

The sample statement below outputs, when a query event arrives, the count of all rows held by the SecuritySummaryTable table:

on QueryEvent select count(*) from SecuritySummaryTable
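The optional insert into clause from the synopsis can route the query result to another stream. In this sketch the SecuritySummaryCountStream stream name and the numRows column name are hypothetical:

```epl
on QueryEvent
insert into SecuritySummaryCountStream
select count(*) as numRows from SecuritySummaryTable
```

Other statements can then consume the result, for example via select numRows from SecuritySummaryCountStream.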

This sample query outputs the total volume per symbol ordered by symbol ascending and only non-zero volumes of all rows held by the OrdersNamedWindow named window:

on QueryEvent
select symbol, sum(volume) from OrdersNamedWindow
group by symbol having volume > 0 order by symbol

When using wildcard (*) to select from streams in an on-select clause, each stream, that is the triggering stream and the selected-upon table or named window, is selected, similar to a join. Therefore your wildcard select returns two columns: the triggering event and the selection result row, for each row.

on QueryEvent as queryEvent
select * from OrdersNamedWindow as win

The query above returns a queryEvent column and a win column for each event. If only a single stream's event is desired in the result, use select win.* instead.

Upon arrival of a QueryEvent event, this statement selects all rows in the OrdersNamedWindow named window:

on QueryEvent select win.* from OrdersNamedWindow as win

The engine executes the query on arrival of a triggering event, in this case a QueryEvent. It posts the query results to any listeners to the statement, in a single invocation, as the new data array.

The where clause filters and correlates rows in the table or named window with the triggering event, as shown next:

on QueryEvent(volume>0) as query
select query.symbol, query.volume, win.symbol from OrdersNamedWindow as win
where win.symbol = query.symbol

Upon arrival of a QueryEvent, if that event has a value for the volume property that is greater than zero, the engine executes the query. The query considers all events currently held by the OrdersNamedWindow that match the symbol property value of the triggering QueryEvent event.
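The same correlation works against a table. The following sketch assumes a hypothetical LoginQueryEvent event type with ipAddress and userId properties and reads the active column of the matching SecuritySummaryTable row:

```epl
on LoginQueryEvent as q
select q.userId, tbl.active
from SecuritySummaryTable as tbl
where tbl.ipAddress = q.ipAddress and tbl.userId = q.userId
```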

An on update clause updates rows held by a table or named window. The clause can be used to update all rows, or only rows that match certain criteria, or rows that correlate with an arriving event or a pattern of arriving events.

For updating a table or named window and for simultaneously checking if the updated row exists or for atomic update-insert operation on a named window or table, consider using on-merge as described in Section 6.8, “Triggered Upsert using the On-Merge Clause”. On-merge is similar to the SQL merge clause and provides what is known as an "Upsert" operation: Update existing events or if no existing event(s) are found then insert a new event, all in one atomic operation provided by a single EPL statement.

The syntax for the on update clause is as follows:

on event_type[(filter_criteria)] [as stream_name]
update window_or_table_name [as stream_name]
set mutation_expression [, mutation_expression [,...]]
[where criteria_expression]

The event_type is the name of the type of events that trigger an update of rows in a table or named window. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as keyword can be used to assign a name for use in expressions and the where clause. Patterns and named windows can also be specified in the on clause.

The window_or_table_name is the name of the table or named window whose rows are to be updated. The as keyword is also available to assign a name to the named window or table.

After the set keyword follows a list of comma-separated mutation_expression expressions. A mutation expression is any valid EPL expression. Subqueries may be part of expressions; however, aggregation functions and the prev and prior functions may not be used in expressions.

The table below shows some typical mutation expressions:


The optional where clause contains a criteria_expression that correlates the arriving (triggering) event to the rows to be updated in the table or named window. The criteria_expression may also simply filter for rows to be updated.

Queries against tables and named windows work the same. We use the term property and column interchangeably. The examples herein use the OrdersNamedWindow named window and the SecuritySummaryTable table to provide examples for each. Let's look at a couple of examples.

In the simplest form, this statement updates all rows in the named window OrdersNamedWindow when any UpdateOrderEvent event arrives, setting the price property to zero for all rows currently held by the named window:

on UpdateOrderEvent update OrdersNamedWindow set price = 0

This example demonstrates the use of a where clause and updates the SecuritySummaryTable table. Upon arrival of a triggering ResetEvent it updates the active column value to false for all table rows that have an active column value of true:

on ResetEvent update SecuritySummaryTable set active = false where active = true

The next example shows a more complete use of the syntax, and correlates the triggering event with rows held by the OrdersNamedWindow named window:

on NewOrderEvent(volume>0) as myNewOrders
update OrdersNamedWindow as myNamedWindow 
set price = myNewOrders.price
where myNamedWindow.symbol = myNewOrders.symbol

In the above sample statement, only if a NewOrderEvent event with a volume greater than zero arrives does the statement trigger. Upon triggering, all rows in the named window that have the same value for the symbol property as the triggering NewOrderEvent event are then updated (their price property is set to that of the arriving event). The statement also showcases the as keyword to assign a name for use in the where expression.

Your application can subscribe a listener to your on update statements to determine update events. The statement posts any rows that are updated to all listeners attached to the statement as new data, and the events prior to the update as old data.

The following example shows the use of tags and a pattern. It sets the price value of orders to that of either a FlushOrderEvent or OrderUpdateEvent depending on which arrived:

on pattern [every ord=OrderUpdateEvent(volume>0) or every flush=FlushOrderEvent] 
update OrdersNamedWindow as win
set price = case when ord.price is null then flush.price else ord.price end
where ord.id = win.id or flush.id = win.id

When updating indexed properties use the syntax propertyName[index] = value with the index value being an integer number. When updating mapped properties use the syntax propertyName(key) = value with the key being a string value.
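A brief sketch of both forms follows. It assumes that the MyWindow rows have a hypothetical indexed property named tags and a hypothetical mapped property named attributes, and that UpdateEvent provides firstTag and status properties:

```epl
on UpdateEvent as upd
update MyWindow as win
set tags[0] = upd.firstTag,              // indexed property: integer index
    attributes('status') = upd.status    // mapped property: string key
```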

The engine executes assignments in the order they are listed. When performing multiple assignments, the engine takes the most recent column value according to the last assignment, if any. To instruct the engine to use the initial value before update, prefix the column name with the literal initial.

The following statement illustrates:

on UpdateEvent as upd
update MyWindow as win
set field_a = 1, 
  field_b = win.field_a, // assigns the value 1 
  field_c = initial.field_a // assigns the field_a original value before update

The next example assumes that your application provides a user-defined function copyFields that receives 3 parameters: the new row, the update event and the initial before-update row.

on UpdateEvent as upd update MyWindow as win set copyFields(win, upd, initial)

You may invoke a method on a value object, for those properties that hold value objects, as follows:

on UpdateEvent update MyWindow as win set someproperty.clear()

For named windows only, you may also invoke a method on the named window event type.

The following example assumes that your event type provides a method by name populateFrom that receives the update event as a parameter:

on UpdateEvent as upd update MyWindow as win set win.populateFrom(upd)

The following restrictions apply:

  1. Each property to be updated via assignment must be writable. For tables, all columns are always writable.
  2. For underlying event representations that are Java objects, the event object class must implement the java.io.Serializable interface as discussed in Section 5.20.1, “Immutability and Updates” and must provide setter methods for updated properties.
  3. When using an XML underlying event type, event properties in the XML document representation are not available for update.
  4. Nested properties are not supported for update. Revision event types and variant streams may also not be updated.

An on delete clause removes rows from a named window or table. The clause can be used to remove all rows, or only rows that match certain criteria, or rows that correlate with an arriving event or a pattern of arriving events.

The syntax for the on delete clause is as follows:

on event_type[(filter_criteria)] [as stream_name]
delete from window_or_table_name [as stream_name]
[where criteria_expression]

The event_type is the name of the type of events that trigger removal from the table or named window. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as keyword can be used to assign a name for use in the where clause. Patterns and named windows can also be specified in the on clause as described in the next section.

The window_or_table_name is the name of the named window or table to delete rows from. The as keyword is also available to assign a name to the table or named window.

The optional where clause contains a criteria_expression that correlates the arriving (triggering) event to the rows to be removed. The criteria_expression may also simply filter for rows without correlating.

On-delete can be used against tables and named windows. The examples herein use the OrdersNamedWindow named window and the SecuritySummaryTable table to provide examples for each.

In the simplest form, this statement deletes all rows from the SecuritySummaryTable table when any ClearEvent arrives:

on ClearEvent delete from SecuritySummaryTable

The next example shows a more complete use of the syntax, and correlates the triggering event with events held by the OrdersNamedWindow named window:

on NewOrderEvent(volume>0) as myNewOrders
delete from OrdersNamedWindow as myNamedWindow 
where myNamedWindow.symbol = myNewOrders.symbol

In the above sample statement, only if a NewOrderEvent event with a volume greater than zero arrives does the statement trigger. Upon triggering, all rows in the named window that have the same value for the symbol property as the triggering NewOrderEvent event are removed. The statement also showcases the as keyword to assign a name for use in the where expression.
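A correlated delete against a table follows the same shape. This sketch assumes a hypothetical LogoutEvent event type with ipAddress and userId properties and removes the single SecuritySummaryTable row for that compound key, if present:

```epl
on LogoutEvent as lo
delete from SecuritySummaryTable as tbl
where tbl.ipAddress = lo.ipAddress and tbl.userId = lo.userId
```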

By means of patterns the on delete clause and on select clause (described earlier) can look for more complex conditions to occur, possibly involving multiple events or the passing of time. The syntax for on delete with a pattern expression is shown next:

on pattern [pattern_expression] [as stream_name]
delete from window_or_table_name [as stream_name]
[where criteria_expression]

The pattern_expression is any pattern that matches zero or more arriving events. Tags can be used to name events in the pattern and can occur in the optional where clause to correlate to events to be removed from a named window.

In the next example the triggering pattern fires every 10 seconds. The effect is that every 10 seconds the statement removes all rows from the SecuritySummaryTable table:

on pattern [every timer:interval(10 sec)] delete from SecuritySummaryTable

The following example shows the use of tags in a pattern and executes against the OrdersNamedWindow named window instead:

on pattern [every ord=OrderEvent(volume>0) or every flush=FlushOrderEvent] 
delete from OrdersNamedWindow as win
where ord.id = win.id or flush.id = win.id

The pattern above looks for OrderEvent events with a volume value greater than zero and tags such events as 'ord'. The pattern also looks for FlushOrderEvent events and tags such events as 'flush'. The where clause deletes from the OrdersNamedWindow named window any rows whose 'id' property value matches that of either arriving event.

The on merge clause is similar to the SQL merge clause. It provides what is known as an "Upsert" operation: Update existing rows or if no existing row(s) are found then insert a new row, all in an atomic operation provided by a single EPL statement.

The syntax for the on merge clause is as follows:

on event_type[(filter_criteria)] [as stream_name]
merge [into] window_or_table_name [as stream_name]
[where criteria_expression]
  when [not] matched [and search_condition]
    then [
      insert [into streamname]
          [ (property_name [, property_name] [,...]) ] 
          select select_expression [, select_expression[,...]]
          [where filter_expression]
      |
      update set mutation_expression [, mutation_expression [,...]]
          [where filter_expression]
      |
      delete
          [where filter_expression]
    ]
    [then [insert|update|delete]] [,then ...]
  [when ...  then ... [...]] 

The event_type is the name of the type of events that trigger the merge. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as keyword can be used to assign a name for use in the where clause. Patterns and named windows can also be specified in the on clause as described in prior sections.

The window_or_table_name is the name of the named window or table to insert, update or delete rows. The as keyword is also available to assign a name to the named window or table.

The optional where clause contains a criteria_expression that correlates the arriving (triggering) event to the rows to be considered of the table or named window. We recommend specifying a criteria expression that is as specific as possible.

Following the where clause is one or more when matched or when not matched clauses in any order. Each may have an additional search condition associated.

After each when [not] matched follow one or more then clauses that each contains the action to take: Either an insert, update or delete keyword.

After when not matched only insert action(s) are available. After when matched any insert, update and delete action(s) are available.

After insert you may optionally specify the into keyword followed by the name of the stream or named window to insert into. If no into and stream name is specified, the insert applies to the current table or named window. Next comes an optional list of the columns inserted, followed by the required select keyword and one or more select-clause expressions. The wildcard (*) is available in the select-clause as well. An optional where-clause may follow; if it returns Boolean false, the action is not applied.

After update follows the set keyword and one or more mutation expressions. For mutation expressions please see Section 6.6, “Updating Data: the On Update clause”. An optional where-clause may follow; if it returns Boolean false, the action is not applied.

After delete an optional where-clause may follow; if it returns Boolean false, the action is not applied.

When according to the where-clause criteria_expression the engine finds no rows in the named window or table that match the condition, the engine evaluates each when not matched clause. If the optional search condition returns true or no search condition was provided then the engine performs all of the actions listed after each then.

When according to the where-clause criteria_expression the engine finds one or more rows in the named window or table that match the condition, the engine evaluates each when matched clause. If the optional search condition returns true or no search condition was provided the engine performs all of the actions listed after each then.

The engine executes when matched and when not matched in the order specified. If the optional search condition returns true or no search condition was specified then the engine takes the associated action (or multiple actions for multiple then keywords). When the block of actions has completed, the engine proceeds to the next matching row, if any. After completing all matching rows the engine continues to the next triggering event, if any.

On-merge can be used with tables and named windows. The examples herein declare a ProductWindow named window and also use the SecuritySummaryTable table to provide examples for each.

This example statement updates the SecuritySummaryTable table when a ResetEvent arrives setting the active column's value to false:

on ResetEvent merge SecuritySummaryTable
when matched and active = true then update set active = false
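An upsert against the same table could be sketched as follows. The LoginEvent event type and its ipAddress and userId properties are assumptions. The numAttempts aggregation column is left out of the insert because aggregation columns are maintained through into table:

```epl
on LoginEvent as le
merge SecuritySummaryTable as sst
where sst.ipAddress = le.ipAddress and sst.userId = le.userId
when matched
  then update set active = true
when not matched
  then insert select le.ipAddress as ipAddress, le.userId as userId, true as active
```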

A longer example utilizing a named window follows. We start by declaring a schema that provides a product id and that holds a total price:

create schema ProductTotalRec as (productId string, totalPrice double)

We create a named window that holds a row for each unique product:

create window ProductWindow.std:unique(productId) as ProductTotalRec

The events for this example are order events that hold an order id, product id, price, quantity and deleted-flag declared by the next schema:

create schema OrderEvent as (orderId string, productId string, price double, 
    quantity int, deletedFlag boolean)

The following EPL statement utilizes on-merge to total up the price for each product based on arriving order events:

on OrderEvent oe
  merge ProductWindow pw
  where pw.productId = oe.productId
  when matched
    then update set totalPrice = totalPrice + oe.price
  when not matched 
    then insert select productId, price as totalPrice

In the above example, when an order event arrives, the engine looks up in the product named window the matching row or rows for the same product id as the arriving event. In this example the engine always finds no row or one row as the product named window is declared with a unique data window based on product id. If the engine finds a row in the named window, it performs the update action adding up the price as defined under when matched. If the engine does not find a row in the named window it performs the insert action as defined under when not matched, inserting a new row.

The insert keyword may be followed by a list of columns as shown in this EPL snippet:

// equivalent to the insert shown in the last 2 lines in above EPL
...when not matched 
    then insert(productId, totalPrice) select productId, price

The second example demonstrates the use of a select-clause with wildcard, a search condition and the delete keyword. It creates a named window that holds order events and employs on-merge to insert order events for which no corresponding order id was found, update quantity to the quantity provided by the last arriving event and delete order events that are marked as deleted:

create window OrderWindow.win:keepall() as OrderEvent
on OrderEvent oe
  merge OrderWindow pw
  where pw.orderId = oe.orderId
  when not matched 
    then insert select *
  when matched and oe.deletedFlag=true
    then delete
  when matched
    then update set pw.quantity = oe.quantity, pw.price = oe.price

In the above example the oe.deletedFlag=true search condition instructs the engine to take the delete action only if the deleted-flag is set.
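As a rough illustration of how the three clauses above are tried in the order listed, the plain-Java sketch below models the named window as a map keyed by order id. It is not Esper code. The class OrderMergeSketch and its members are hypothetical, and the sketch assumes rows enter the window only through the merge itself, so that at most one row exists per order id.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the insert/delete/update merge above; this is not Esper code.
class OrderMergeSketch {
    static final class Order {
        final String orderId;
        final int quantity;
        final double price;
        final boolean deletedFlag;

        Order(String orderId, int quantity, double price, boolean deletedFlag) {
            this.orderId = orderId;
            this.quantity = quantity;
            this.price = price;
            this.deletedFlag = deletedFlag;
        }
    }

    // Stands in for OrderWindow; assumes rows are only added through merge().
    private final Map<String, Order> window = new HashMap<>();

    // Applies one arriving order event and reports which action was taken.
    String merge(Order oe) {
        Order pw = window.get(oe.orderId);
        if (pw == null) {
            // when not matched then insert select *
            window.put(oe.orderId, oe);
            return "insert";
        }
        if (oe.deletedFlag) {
            // when matched and oe.deletedFlag=true then delete
            window.remove(oe.orderId);
            return "delete";
        }
        // when matched then update set pw.quantity = oe.quantity, pw.price = oe.price
        window.put(oe.orderId, new Order(pw.orderId, oe.quantity, oe.price, pw.deletedFlag));
        return "update";
    }

    // Returns the row held for an order id, or null if none.
    Order row(String orderId) {
        return window.get(orderId);
    }
}
```

Listing the delete clause before the unconditional update clause matters: if the order were reversed, the update branch would always be taken for a matching row and the delete branch would never be reached.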

You may specify multiple actions by providing multiple then keywords each followed by an action. Each of the insert, update and delete actions can itself have a where-clause as well. If a where-clause exists for an action, the engine evaluates the where-clause and applies the action only if the where-clause returns Boolean true.

This example specifies two update actions and uses the where-clause to trigger different update behavior depending on whether the order event price is less than zero. This example assumes that the host application defined a clearorder user-defined function, to demonstrate calling a user-defined function as part of the update mutation expressions:

on OrderEvent oe
  merge OrderWindow pw
  where pw.orderId = oe.orderId
  when matched
    then update set clearorder(pw) where oe.price < 0
    then update set pw.quantity = oe.quantity, pw.price = oe.price where oe.price >= 0

To insert events into another stream and not the named window, use insert into streamname.

In the next example each matched-clause contains two actions, one action to insert a log event and a second action to insert, delete or update:

on OrderEvent oe
  merge OrderWindow pw
  where pw.orderId = oe.orderId
  when not matched 
    then insert into LogEvent select 'this is an insert' as name
    then insert select *
  when matched and oe.deletedFlag=true
    then insert into LogEvent select 'this is a delete' as name
    then delete
  when matched
    then insert into LogEvent select 'this is an update' as name
    then update set pw.quantity = oe.quantity, pw.price = oe.price

While the engine evaluates and executes all actions listed under the same matched-clause in order, you may not rely on updated field values of an earlier action to trigger the where-clause of a later action. Similarly, you should avoid simultaneous update and delete actions for the same match: the engine does not guarantee whether the update or the delete takes final effect.

Your application can subscribe a listener to on-merge statements to determine inserted, updated and removed events. Statements post any events that are inserted into, updated in or deleted from a named window to all listeners attached to the statement as new data and removed data.

The following limitations apply to on-merge statements:

  1. Aggregation functions and the prev and prior operators are not available in conditions and the select-clause.

You may explicitly create an index on a table or a named window. The engine considers explicitly-created as well as implicitly-allocated indexes (named windows only) in query planning and execution of the following types of usages of tables and named windows:

  1. On-demand (fire-and-forget, non-continuous) queries as described in Section 15.5, “On-Demand Fire-And-Forget Query Execution”.
  2. On-select, on-merge, on-update, on-delete and on-insert.

  3. Subqueries against tables and named windows.

  4. For joins (including outer joins) with named windows the engine considers the filter criteria listed in parentheses using the syntax

    named_window_name(filter_criteria)

    for index access.

  5. For joins with tables the engine considers the primary key columns (if any) as well as any table indexes.

Please use the following syntax to create an explicit index on a named window or table:

create [unique] index index_name on window_or_table_name (property [hash|btree] 
    [, property] [hash|btree] [,...] )

The optional unique keyword indicates that the property or properties uniquely identify rows. If unique is not specified the index allows duplicate rows.

The index_name is the name assigned to the index. The name uniquely identifies the index and is used in engine query plan logging.

The window_or_table_name is the name of an existing table or named window. If the named window or table has rows already, the engine builds an index for the rows.

The list of property names are the properties of rows to include in the index (we use the term property and column interchangeably). Following each property name you may specify the optional hash or btree keyword.

If you specify no keyword or the hash keyword for a property, the index will be a hash-based (unsorted) index in respect to that property. If you specify the btree keyword, the index will be a binary-tree-based sorted index in respect to that property. You may combine hash and btree properties for the same index. Specify btree for a property if you expect to perform numerical or string comparison using relational operators (<, >, >=, <=), the between or the in keyword for ranges and inverted ranges. Use hash (the default) instead of btree if you expect to perform exact comparison using =.
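To make the difference between the two index kinds concrete, here is a small plain-Java analogy: a HashMap serves exact-match lookups, while a TreeMap keeps keys sorted and can answer range lookups. This is only an illustration and says nothing about how the engine implements its indexes. The class IndexKinds and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of hash versus sorted index access; not Esper internals.
class IndexKinds {
    // Hash-style index: property value -> row numbers, suited to "=" lookups.
    static Map<String, List<Integer>> hashIndex(String[] symbols) {
        Map<String, List<Integer>> index = new HashMap<>();
        for (int row = 0; row < symbols.length; row++) {
            index.computeIfAbsent(symbols[row], k -> new ArrayList<>()).add(row);
        }
        return index;
    }

    // Sorted index: property value -> row numbers, suited to range lookups.
    static TreeMap<Double, List<Integer>> sortedIndex(double[] prices) {
        TreeMap<Double, List<Integer>> index = new TreeMap<>();
        for (int row = 0; row < prices.length; row++) {
            index.computeIfAbsent(prices[row], k -> new ArrayList<>()).add(row);
        }
        return index;
    }

    // Rows with low <= price <= high, comparable to "buyPrice between low and high".
    static List<Integer> between(TreeMap<Double, List<Integer>> index, double low, double high) {
        List<Integer> rows = new ArrayList<>();
        for (List<Integer> ids : index.subMap(low, true, high, true).values()) {
            rows.addAll(ids);
        }
        return rows;
    }
}
```

A hash structure has no notion of key order, so it cannot serve the range lookup without scanning every entry, which is the reason to choose btree for properties compared with relational operators or ranges.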

The create index syntax is the same for tables and named windows. The examples herein create a new UserProfileWindow named window and also use the SecuritySummaryTable table.

This sample EPL creates a non-unique index on the active column of table SecuritySummaryTable:

create index MyIndex on SecuritySummaryTable(active)

We list a few example EPL statements next that create a named window and create a single index:

// create a named window
create window UserProfileWindow.win:time(1 hour) select * from UserProfile
// create a non-unique index (duplicates allowed) for the user id property only
create index UserProfileIndex on UserProfileWindow(userId)

Next, execute an on-demand fire-and-forget query as shown below. The example uses the prepared version of the query:

String query = "select * from UserProfileWindow where userId='Joe'";
EPOnDemandPreparedQuery prepared = epRuntime.prepareQuery(query);
// query performance excellent in the face of large number of rows
EPOnDemandQueryResult result = prepared.execute();
// ...later ...
prepared.execute();	// execute a second time

A unique index is generally preferable over non-unique indexes. For named windows, if your data window declares a unique data window (std:unique, std:firstunique, including intersections and grouped unique data windows) it is not necessary to create a unique index unless index sharing is enabled, since the engine considers the unique data window declaration in query planning.

The engine enforces uniqueness (e.g. unique constraint) for unique indexes. If your application inserts a duplicate row the engine raises a runtime exception when processing the statement and discards the row. The default error handler logs such an exception and continues.

For example, if the user id together with the profile id uniquely identifies an entry into the named window, your application can create a unique index as shown below:

// create a unique index on user id and profile id
create unique index UserProfileIndex on UserProfileWindow(userId, profileId)
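The uniqueness check on the composite key (user id, profile id) can be pictured with the plain-Java sketch below. It is an analogy only and does not use the Esper API. The class UniqueIndexSketch is hypothetical, and the choice of IllegalStateException is an assumption made for the sketch, not the exception type the engine raises.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a unique composite index; this is not Esper code.
class UniqueIndexSketch {
    // Composite key (userId, profileId) -> row payload.
    private final Map<List<String>, String> rowByKey = new HashMap<>();

    // Inserts a row; a duplicate key raises an exception and the row is discarded.
    void insert(String userId, String profileId, String row) {
        List<String> key = Arrays.asList(userId, profileId);
        if (rowByKey.containsKey(key)) {
            // Exception type is an assumption of this sketch.
            throw new IllegalStateException("Unique index violated for key " + key);
        }
        rowByKey.put(key, row);
    }

    int size() {
        return rowByKey.size();
    }
}
```

The same user id may appear with different profile ids; only a repeat of the full key combination is rejected.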

By default, the engine builds a hash-code-based index useful for direct comparison via equals (=). Filter expressions that look for ranges or use in or between do not benefit from the hash-based index and should use the btree keyword. For direct comparison via equals (=) the engine does not use btree indexes.

The next example creates a composite index over two fields symbol and buyPrice:

// create a named window
create window TickEventWindow.win:time(1 hour) as (symbol string, buyPrice double)
// create a non-unique index 
create index idx1 on TickEventWindow(symbol hash, buyPrice btree)

A sample fire-and-forget query is shown below (this time the API calls are not shown):

// query performance excellent in the face of large number of rows
select * from TickEventWindow where symbol='GE' and buyPrice between 10 and 20

Note

A table that does not declare one or more primary key columns cannot have a secondary index, as the table holds a maximum of one row.

Fire-and-Forget queries can be run against both tables and named windows. We use the term property and column interchangeably.

For selecting from named windows and tables, please see the examples in Section 15.5, “On-Demand Fire-And-Forget Query Execution”.

For data manipulation (insert, update, delete) queries, the on-demand query API returns the inserted, updated or deleted rows when the query executes against a named window.

Your application can insert rows into a table or named window using on-demand (fire-and-forget, non-continuous) queries as described in Section 15.5, “On-Demand Fire-And-Forget Query Execution”.

The engine allows the standard SQL syntax and values keyword and also supports using select to provide values.

The syntax using the values keyword is:

insert into window_or_table_name [(property_names)]
values (value_expressions)

The syntax using select is as follows:

insert into window_or_table_name [(property_names)]
select value_expressions

The window_or_table_name is the name of the table or named window to insert rows into.

After the named window or table name you can optionally provide a comma-separated list of property names.

When providing property names, the order of value expressions in the values list or select clause must match the order of property names specified. Column names provided in the select-clause, if specified, are ignored.

When not providing property names and when specifying the values keyword, the order of values must match the order of properties declared for the named window or table. When not providing property names and when specifying the select-clause, expressions must name the properties to be inserted into by assigning a column name using the as keyword.
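The ordering rules for the values form can be sketched in plain Java as a mapping from positions to property names. This is an illustration rather than Esper code. The class InsertMapping and its row method are hypothetical, and leaving unassigned properties as null is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of how values map onto properties; this is not Esper code.
class InsertMapping {
    // declared: property names in the order declared for the window or table.
    // named:    property names listed after the window name, or null when omitted.
    // values:   the value expressions of the values list, in order.
    static Map<String, Object> row(List<String> declared, List<String> named, List<Object> values) {
        List<String> targets = (named != null) ? named : declared;
        if (values.size() > targets.size()) {
            throw new IllegalArgumentException("More values than target properties");
        }
        Map<String, Object> row = new LinkedHashMap<>();
        for (String property : declared) {
            row.put(property, null); // assumption: unassigned properties stay null
        }
        for (int i = 0; i < values.size(); i++) {
            String target = targets.get(i);
            if (!declared.contains(target)) {
                throw new IllegalArgumentException("Unknown property " + target);
            }
            row.put(target, values.get(i)); // the i-th value goes to the i-th target
        }
        return row;
    }
}
```

With a property list, position i of the values pairs with position i of that list. Without one, position i pairs with the i-th declared property.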

The example code snippet inserts a new order row into the OrdersWindow named window:

String query = 
  "insert into OrdersWindow(orderId, symbol, price) values ('001', 'GE', 100)";
epService.getEPRuntime().executeQuery(query);

Instead of the values keyword you may specify a select-clause as this example shows:

String query = 
  "insert into OrdersWindow(orderId, symbol, price) select '001', 'GE', 100";
epService.getEPRuntime().executeQuery(query);

The following EPL inserts the same values as above but specifies property names as part of the select-clause expressions:

insert into OrdersWindow
select '001' as orderId, 'GE' as symbol, 100 as price

The next EPL inserts the same values as above and does not specify property names thereby populating the first 3 properties of the type of the named window:

insert into OrdersWindow values ('001', 'GE', 100)

Your application can update table and named window rows using on-demand (fire-and-forget, non-continuous) queries as described in Section 15.5, “On-Demand Fire-And-Forget Query Execution”.

The syntax for the update clause is as follows:

update window_or_table_name [as stream_name]
set mutation_expression [, mutation_expression [,...]]
[where criteria_expression]

The window_or_table_name is the name of the table or named window whose rows are to be updated. The as keyword is also available to assign a name to the table or named window.

After the set keyword follows a comma-separated list of mutation expressions. For fire-and-forget queries the following restriction applies: Subqueries, aggregation functions and the prev or prior function may not be used in expressions. Mutation expressions are detailed in Section 6.6, “Updating Data: the On Update clause”.

The optional where clause contains a criteria_expression that identifies rows to be updated.

The example code snippet updates those rows of the named window that have a negative value for volume:

String query = "update OrdersNamedWindow set volume = 0 where volume < 0";
epService.getEPRuntime().executeQuery(query);

To instruct the engine to use the initial property value before update, prefix the property name with the literal initial.

Your application can delete rows from a named window or table using on-demand (fire-and-forget, non-continuous) queries as described in Section 15.5, “On-Demand Fire-And-Forget Query Execution”.

The syntax for the delete clause is as follows:

delete from window_or_table_name [as stream_name]
	[where criteria_expression]

The window_or_table_name is the name of the named window or table to delete rows from. The as keyword is also available to assign a name to the named window or table.

The optional where clause contains a criteria_expression that identifies rows to be removed from the named window or table.

The example code snippet deletes from a named window all rows that have a zero or negative value for volume:

String query = "delete from OrdersNamedWindow where volume <= 0";
epService.getEPRuntime().executeQuery(query);

As outlined in Section 2.10, “Updating, Merging and Versioning Events”, revision event types process updates or new versions of events held by a named window.

A revision event type is simply one or more existing pre-configured event types whose events are related, as configured by static configuration, by event properties that provide same key values. The purpose of key values is to indicate that arriving events are related: An event amends, updates or adds properties to an earlier event that shares the same key values. No additional EPL is needed when using revision event types for merging event data.

Revision event types can be useful in these situations:

  1. Some of your events carry only partial information that is related to a prior event and must be merged together.

  2. Events arrive that add additional properties or change existing properties of prior events.

  3. Events may carry properties that have null values or properties that do not exist (for example events backed by Map or XML), and for such properties the earlier value must be used instead.

To better illustrate, consider a revision event type that represents events for creation and updates to user profiles. Let's assume the user profile creation events carry the user id and a full profile. The profile update events indicate only the user id and the individual properties that actually changed. The user id property shall serve as a key value relating profile creation events and update events.

A revision event type must be configured to instruct the engine which event types participate and what their key properties are. Configuration is described in Section 16.4.26, “Revision Event Type” and is not shown here.

Assume that an event type UserProfileRevisions has been configured to hold profile events, i.e. creation and update events related by user id. This statement creates a named window to hold the last 1 hour of current profiles per user id:

create window UserProfileWindow.win:time(1 hour) select * from UserProfileRevisions
insert into UserProfileWindow select * from UserProfileCreation
insert into UserProfileWindow select * from UserProfileUpdate

In revision event types, the term base event is used to describe events that are subject to update. Events that update, amend or add additional properties to base events are termed delta events. In the example, base events are profile creation events and delta events are profile update events.

Base events are expected to arrive before delta events. In the case where a delta event arrives and is not related by key value to a base event or a revision of the base event currently held by the named window the engine ignores the delta event. Thus, considering the example, profile update events for a user id that does not have an existing profile in the named window are not applied.

When a base or delta event arrives, the insert and remove stream output by the named window are the current and the prior version of the event. Let's come back to the example. As creation events arrive that are followed by update events or more creation events for the same user id, the engine posts the current version of the profile as insert stream (new data) and the prior version of the profile as remove stream (old data).

Base events are also implicitly delta events. That is, if multiple base events of the same key property values arrive, then each base event provides a new version. In the example, if multiple profile creation events arrive for the same user id then new versions of the current profile for that user id are output by the engine for each base event, as it does for delta events.
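The base and delta behavior described above can be pictured with the following plain-Java sketch, which keeps one current profile per user id. It is not Esper code and does not reflect the configurable merge strategies. It assumes one simple strategy in which a base event becomes the new version as-is and a delta event overlays only its non-null properties. The class ProfileRevisions and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of base and delta events keyed by user id; this is not Esper code.
class ProfileRevisions {
    // Current version of the profile per user id (the key value).
    private final Map<String, Map<String, Object>> currentByUser = new HashMap<>();

    // Base event (profile creation): becomes the new current version.
    // Returns the prior version (remove stream), or null if there was none.
    Map<String, Object> onCreation(String userId, Map<String, Object> profile) {
        return currentByUser.put(userId, new HashMap<>(profile));
    }

    // Delta event (profile update): overlays non-null properties on the current version.
    // Returns the prior version, or null if the delta was ignored.
    Map<String, Object> onUpdate(String userId, Map<String, Object> changes) {
        Map<String, Object> prior = currentByUser.get(userId);
        if (prior == null) {
            return null; // no related base event is held: the delta is ignored
        }
        Map<String, Object> next = new HashMap<>(prior);
        for (Map.Entry<String, Object> change : changes.entrySet()) {
            if (change.getValue() != null) {
                next.put(change.getKey(), change.getValue());
            }
        }
        currentByUser.put(userId, next);
        return prior;
    }

    // Current version (insert stream), or null if no profile is held.
    Map<String, Object> current(String userId) {
        return currentByUser.get(userId);
    }
}
```

Each call that produces a new version yields a pair: the value returned is the prior version, and current(userId) is the new one, corresponding to the remove and insert stream respectively.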

The expiry policy as specified by view definitions applies to each distinct key value, or multiple distinct key values for composite keys. An expiry policy re-evaluates when new versions arrive. In the example, user profile events expire from the time window when no creation or update event for a given user id has been received for 1 hour.

Tip

It usually does not make sense to configure a revision event type without delta event types. Use the unique data window (std:unique) or unique data window in intersection with other data windows instead (i.e. std:unique(field).win:time(1 hour)).

Several strategies are available for merging or overlaying events as the configuration chapter describes in greater detail.

Any of the Map, XML and JavaBean event representations as well as plug-in event representations may participate in a revision event type. For example, profile creation events could be JavaBean events, while profile update events could be java.util.Map events.

Delta events may also add properties to the revision event type. For example, one could add a new event type with security information to the revision event type and such security-related properties become available on the resulting revision event type.

The following restrictions apply to revision event types:

  • Nested properties are only supported for the JavaBean event representation. Nested properties are not individually versioned; they are instead versioned by the containing property.

  • Dynamic, indexed and mapped properties are only supported for nested properties and not as properties of the revision event type itself.

Columns in a named window and table may also hold an event or multiple events. More information on the insert into clause providing event columns is in Section 5.10.5, “Event as a Property”.

A sample declaration for a named window and a table is:

create schema InnerData (value string)
create table ContainerTable (innerdata InnerData)
create window ContainerWindow.win:time(30) as (innerdataArray InnerData[]) // array of events

The second sample creates a named window that specifies two columns: A column that holds an OrderEvent, and a column by name priceTotal. A matching insert into statement is also part of the sample:

create window OrdersWindow.win:time(30) as select this, price as priceTotal from OrderEvent
insert into OrdersWindow select order, price * unit as priceTotal  
from ServiceOrderEvent as order

Note that the this property must exist on the event and must return the event class itself (JavaBean events only). The property type of the additional priceTotal column is the property type of the existing price property.

Event patterns match when an event or multiple events occur that match the pattern's definition. Patterns can also be time-based.

Pattern expressions consist of pattern atoms and pattern operators:

  1. Pattern atoms are the basic building blocks of patterns. Atoms are filter expressions, observers for time-based events and plug-in custom observers that observe external events not under the control of the engine.

  2. Pattern operators control expression lifecycle and combine atoms logically or temporally.

The below table outlines the different pattern atoms available:


There are 4 types of pattern operators:

  1. Operators that control pattern sub-expression repetition: every, every-distinct, [num] and until

  2. Logical operators: and, or, not

  3. Temporal operators that operate on event order: -> (followed-by)

  4. Guards are where-conditions that control the lifecycle of subexpressions. Examples are timer:within, timer:withinmax and while-expression. Custom plug-in guards may also be used.

Pattern expressions can be nested arbitrarily deep by including the nested expression(s) in round parentheses ().

Underlying the pattern matching is a state machine that transitions between states based on arriving events and based on time advancing. A single event or advancing time may cause a reaction in multiple parts of your active pattern state.

This is an example pattern expression that matches on every ServiceMeasurement event in which the value of the latency event property is over 20 seconds, and on every ServiceMeasurement event in which the success property is false. Either one or the other condition must be true for this pattern to match.

every (
  spike=ServiceMeasurement(latency>20000) 
  or 
  error=ServiceMeasurement(success=false)
)

In the example above, the pattern expression starts with an every operator to indicate that the pattern should fire for every matching event and not just the first matching event. Within the every operator in parentheses is a nested pattern expression using the or operator. The left-hand side of the or operator is a filter expression that filters for events with a high latency value. The right-hand side of the operator contains a filter expression that filters for events with error status. Filter expressions are explained in Section 7.4, “Filter Expressions In Patterns”.

The example above assigned the tags spike and error to the events in the pattern. The tags are important since the engine only places tagged events into the output event(s) that a pattern generates, and that the engine supplies to listeners of the pattern statement. The tags can further be selected in the select-clause of an EPL statement as discussed in Section 5.4.2, “Pattern-based Event Streams”.

Patterns can also contain comments within the pattern as outlined in Section 5.2.2, “Using Comments”.

Pattern statements are created via the EPAdministrator interface, which provides two ways to create them. Pattern statements that make use of the EPL select clause or any other EPL constructs use the createEPL method to create a statement that specifies one or more pattern expressions. EPL statements that use patterns are described in more detail in Section 5.4.2, “Pattern-based Event Streams”. Use the syntax shown in the example below.

EPAdministrator admin = EPServiceProviderManager.getDefaultProvider().getEPAdministrator();

String eventName = ServiceMeasurement.class.getName();

EPStatement myTrigger = admin.createEPL("select * from pattern [" +
  "every (spike=" + eventName + "(latency>20000) or error=" + eventName + "(success=false))]");

Pattern statements that do not need the EPL select clause or any other EPL constructs can use the createPattern method, as in the example below.

EPStatement myTrigger = admin.createPattern(
  "every (spike=" + eventName + "(latency>20000) or error=" + eventName + "(success=false))");

Partially-completed patterns are incomplete matches that are not yet indicated by the engine because the complete pattern condition is not satisfied. Any given event can be part of multiple partially-completed patterns.

For example, consider the following pattern:

every a=A -> B and C(id=a.id)

Given this sequence of events:

A1{id='id1'}   A2{id='id2'}   B1  

According to the sequence above there are no matches. The pattern is partially completed waiting for C events. The combination {A1, B1} is waiting for a C{id='id1'} event before the pattern match is complete for that combination. The combination {A2, B1} is waiting for a C{id='id2'} event before the pattern match is complete for that combination.

Assuming event C1{id='id1'} arrives the pattern outputs the combination {A1, B1, C1}. Assuming event C2{id='id2'} arrives the pattern outputs the combination {A2, B1, C2}. Note that event B1 is part of both partially-completed patterns.

Use the @DiscardPartialsOnMatch pattern-level annotation to instruct the engine that when any matches occur to discard partially completed patterns that overlap in terms of the events that make up the match (or matches if there are multiple matches).

The same example using the @DiscardPartialsOnMatch pattern-level annotation is:

select * from pattern @DiscardPartialsOnMatch [every a=A -> B and C(id=a.id)]

When event C1{id='id1'} arrives the pattern outputs the match combination {A1, B1, C1}. Upon indication of the match the engine discards all partially-completed patterns that refer to any of the A1, B1 and C1 events. Since event B1 is part of a partially-completed pattern waiting for C{id='id2'}, the engine discards that partially-completed pattern. Therefore when C2{id='id2'} arrives the engine outputs no matches.
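The walk-through above can be replayed with a small plain-Java simulation of the pattern every a=A -> B and C(id=a.id). This is a simplified model written for this example only. It is not the engine's state machine and does not use the Esper API. The class PartialPatternSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical simulation of "every a=A -> B and C(id=a.id)"; this is not Esper code.
class PartialPatternSketch {
    // One partially-completed pattern: an A event plus the B and C seen so far.
    static final class Partial {
        final String a;
        final String id;
        String b;
        String c;

        Partial(String a, String id) {
            this.a = a;
            this.id = id;
        }

        boolean complete() {
            return b != null && c != null;
        }
    }

    private final List<Partial> partials = new ArrayList<>();
    private final boolean discardPartialsOnMatch;

    PartialPatternSketch(boolean discardPartialsOnMatch) {
        this.discardPartialsOnMatch = discardPartialsOnMatch;
    }

    // Every A event starts a new partially-completed pattern.
    void onA(String name, String id) {
        partials.add(new Partial(name, id));
    }

    // A B event applies to all partials still waiting for a B.
    List<String> onB(String name) {
        for (Partial p : partials) {
            if (p.b == null) p.b = name;
        }
        return collectMatches();
    }

    // A C event applies to partials waiting for a C with the same id as their A.
    List<String> onC(String name, String id) {
        for (Partial p : partials) {
            if (p.c == null && p.id.equals(id)) p.c = name;
        }
        return collectMatches();
    }

    private List<String> collectMatches() {
        List<Partial> matched = new ArrayList<>();
        for (Iterator<Partial> it = partials.iterator(); it.hasNext();) {
            Partial p = it.next();
            if (p.complete()) {
                matched.add(p);
                it.remove();
            }
        }
        if (discardPartialsOnMatch) {
            // Drop remaining partials that share a B or C event with any match.
            for (Partial m : matched) {
                partials.removeIf(p -> m.b.equals(p.b) || m.c.equals(p.c));
            }
        }
        List<String> out = new ArrayList<>();
        for (Partial m : matched) {
            out.add("{" + m.a + ", " + m.b + ", " + m.c + "}");
        }
        return out;
    }
}
```

Feeding A1, A2, B1, C1, C2 into an instance created with false yields {A1, B1, C1} and then {A2, B1, C2}. With true, the second partial is dropped when the first match is reported because both contain B1, so C2 produces no match.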

When specifying both @DiscardPartialsOnMatch and @SuppressOverlappingMatches the engine discards the partially-completed patterns that overlap all matches including suppressed matches.

The operators at the top of this table take precedence over operators lower on the table.


If you are not sure about the precedence, please consider placing parentheses () around your subexpressions. Parentheses can also help make expressions easier to read and understand.

The following table outlines sample equivalent expressions, with and without the use of parentheses for subexpressions.


The simplest form of filter is a filter for events of a given type without any conditions on the event property values. This filter matches any event of that type regardless of the event's properties. The example below is such a filter. Note that this event pattern would stop firing as soon as the first RfidEvent is encountered.

com.mypackage.myevents.RfidEvent

To make the event pattern fire for every RfidEvent and not just the first event, use the every keyword.

every com.mypackage.myevents.RfidEvent

The example above specifies the fully-qualified Java class name as the event type. Via configuration, the event pattern above can be simplified by using the name that has been defined for the event type.

every RfidEvent

Interfaces and superclasses are also supported as event types. In the below example IRfidReadable is an interface class, and the statement matches any event that implements this interface:

every org.myorg.rfid.IRfidReadable

The filtering criteria to filter for events with certain event property values are placed within parentheses after the event type name:

RfidEvent(category="Perishable")

All expressions can be used in filters, including static method invocations that return a boolean value:

RfidEvent(com.mycompany.MyRFIDLib.isInRange(x, y) or (x<0 and y < 0))

Filter expressions can be separated via a single comma ','. The comma represents a logical AND between expressions:

RfidEvent(zone=1, category=10)
...is equivalent to...
RfidEvent(zone=1 and category=10)

The following set of operators are highly optimized through indexing and are the preferred means of filtering high-volume event streams:

  • equals =

  • not equals !=

  • comparison operators < , > , >=, <=

  • ranges

    • use the between keyword for a closed range where both endpoints are included

    • use the in keyword and round () or square brackets [] to control how endpoints are included

    • for inverted ranges use the not keyword and the between or in keywords

  • list-of-values checks using the in keyword or the not in keywords followed by a comma-separated list of values

At compile time as well as at run time, the engine scans new filter expressions for subexpressions that can be indexed. Indexing filter values to match event properties of incoming events enables the engine to match incoming events faster. The above list of operators represents the set of operators that the engine can best convert into indexes. The use of comma or logical and in filter expressions does not impact optimizations by the engine.
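The endpoint rules for ranges listed above can be written down as a small plain-Java predicate. This is an illustration of the semantics only, not Esper code. As we understand the EPL range syntax, a square bracket includes the endpoint and a round parenthesis excludes it, for example in [low:high] versus in (low:high). The class RangeSketch is hypothetical and assumes low is not greater than high.

```java
// Hypothetical model of range checks in filters; this is not Esper code.
class RangeSketch {
    // Models "value in <bracket>low:high<bracket>"; a true flag means the endpoint
    // is included (square bracket), false means it is excluded (round parenthesis).
    static boolean in(double value, double low, boolean includeLow,
                      double high, boolean includeHigh) {
        boolean aboveLow = includeLow ? value >= low : value > low;
        boolean belowHigh = includeHigh ? value <= high : value < high;
        return aboveLow && belowHigh;
    }

    // Models "value between low and high": a closed range, both endpoints included.
    static boolean between(double value, double low, double high) {
        return in(value, low, true, high, true);
    }

    // Models the inverted range "value not in ...".
    static boolean notIn(double value, double low, boolean includeLow,
                         double high, boolean includeHigh) {
        return !in(value, low, includeLow, high, includeHigh);
    }
}
```

For instance, the value 10 passes between(10, 10, 20) but fails a half-open check that excludes the low endpoint.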

For more information on filters please see Section 5.4.1, “Filter-based Event Streams”. Contained-event selection on filters in patterns is further described in Section 5.19, “Contained-Event Selection”.

Filter criteria can also refer to events matching prior named events in the same expression. The pattern below is an example in which the pattern matches once for every RfidEvent that is preceded by an RfidEvent with the same asset id.

every e1=RfidEvent -> e2=RfidEvent(assetId=e1.assetId)

The syntax shown above allows filter criteria to reference prior results by specifying the event name tag of the prior event, and the event property name. The tag names in the above example were e1 and e2. This syntax can be used in all filter operators or expressions including ranges and the in set-of-values check:

every e1=RfidEvent -> 
  e2=RfidEvent(MyLib.isInRadius(e1.x, e1.y, x, y) and zone in (1, e1.zone))

An arriving event changes the truth value of all expressions that look for the event. Consider the pattern as follows:

every (RfidEvent(zone > 1) and RfidEvent(zone < 10))

The pattern above is satisfied as soon as only one event with zone in the interval [2, 9] is received.
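The plain-Java sketch below models why a single event can satisfy both sides of the and operator: the same arriving event is offered to every active filter. It is a simplified model and not Esper code. The class AndPatternSketch is hypothetical, and the reset after a match is a simplification of how every restarts the sub-expression.

```java
// Hypothetical model of "every (RfidEvent(zone > 1) and RfidEvent(zone < 10))"; not Esper code.
class AndPatternSketch {
    private boolean leftSeen;  // RfidEvent(zone > 1) has matched
    private boolean rightSeen; // RfidEvent(zone < 10) has matched

    // Offers one RfidEvent to both filters; returns true when the pattern fires.
    boolean onRfidEvent(int zone) {
        // The event is not consumed by one filter: both filters get to see it.
        if (zone > 1) leftSeen = true;
        if (zone < 10) rightSeen = true;
        boolean fired = leftSeen && rightSeen;
        if (fired) {
            // Simplification: start over for the next match.
            leftSeen = false;
            rightSeen = false;
        }
        return fired;
    }
}
```

An event with zone 5 turns both filters true at once and the pattern fires immediately. An event with zone 20 satisfies only the left filter, and the pattern then fires on the next event with a zone below 10.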

An arriving event applies to all filter expressions for which the event matches. In other words, an arriving event is not consumed by any specific filter expression(s) but applies to all active filter expressions of all pattern sub-expressions.

You may provide the @consume annotation as part of a filter expression to control consumption of an arriving event. If an arriving event matches the filter expression marked with @consume it is no longer available to other filter expressions of the same pattern that also match the arriving event.

The @consume annotation can include a level number in parentheses. A higher level number consumes the event first. The default level number is 1. Multiple filter expressions with the same level number for @consume all match the event.

Consider the next sample pattern:

a=RfidEvent(zone='Z1') and b=RfidEvent(assetId='0001')

This pattern fires when a single RfidEvent event arrives that has zone 'Z1' and assetId '0001'. The pattern also matches when two RfidEvent events arrive, in any order, wherein one has zone 'Z1' and the other has assetId '0001'.

Mark a filter expression with @consume to indicate that, if an arriving event matches multiple filter expressions, the engine prefers the marked filter expression and does not match any other filter expression.

This updated pattern statement uses @consume to indicate that a match against zone is preferred:

a=RfidEvent(zone='Z1')@consume and b=RfidEvent(assetId='0001')

This pattern no longer fires when a single RfidEvent arrives that has zone 'Z1' and assetId '0001', because when the first filter expression matches the pattern engine consumes the event. The pattern only matches when two RfidEvent events arrive in any order. One event must have zone 'Z1' and the other event must have a zone other than 'Z1' and an assetId '0001'.

The next sample pattern provides a level number for each @consume:

a=RfidEvent(zone='Z1')@consume(2) 
  or b=RfidEvent(assetId='0001')@consume(1) 
  or c=RfidEvent(category='perishable')

The pattern fires when an RfidEvent arrives with zone 'Z1'. In this case the output event populates property 'a' but not properties 'b' and 'c'. The pattern also fires when an RfidEvent arrives with a zone other than 'Z1' and an asset id of '0001'. In this case the output event populates property 'b' but not properties 'a' and 'c'. The pattern also fires when an RfidEvent arrives with a zone other than 'Z1' and an asset id other than '0001' and a category of 'perishable'. In this case the output event populates property 'c' but not properties 'a' and 'b'.
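The consumption rules can be modeled outside the engine. The following Python sketch is a simplified illustration and not Esper code: the function consuming_filters and the tuple layout for filters are assumptions made for this example. It returns the tags of the filters that receive an arriving event, giving preference to the highest matching @consume level:

```python
def consuming_filters(event, filters):
    """Return the tags of the filters that receive `event`.

    `filters` is a list of (tag, predicate, consume_level) tuples, where
    consume_level is None for a filter without @consume (hypothetical layout).
    """
    matching = [(tag, level) for tag, predicate, level in filters if predicate(event)]
    levels = [level for _, level in matching if level is not None]
    if not levels:
        # No @consume filter matched: the event applies to every matching filter.
        return [tag for tag, _ in matching]
    top = max(levels)
    # The highest matching level consumes the event; equal levels share it.
    return [tag for tag, level in matching if level == top]


filters = [
    ("a", lambda e: e["zone"] == "Z1", 2),
    ("b", lambda e: e["assetId"] == "0001", 1),
    ("c", lambda e: e["category"] == "perishable", None),
]
event = {"zone": "Z2", "assetId": "0001", "category": "perishable"}
print(consuming_filters(event, filters))  # ['b']
```

The sample event has a zone other than 'Z1' and an asset id of '0001', so only the filter tagged b receives it even though the filter tagged c also matches.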

When your filter expression provides the name of a named window then the filter expression matches each time an event is inserted into the named window that matches the filter conditions.

For example, assume a named window that holds the last order event per order id:

create window LastOrderWindow.std:unique(orderId) as OrderEvent

Assume that all order events are inserted into the named window using insert-into:

insert into LastOrderWindow select * from OrderEvent

This sample pattern fires 10 seconds after an order event with a price of 100 or more was inserted:

select * from pattern [every o=LastOrderWindow(price >= 100) -> timer:interval(10 sec)]

The pattern above fires only for events inserted-into the LastOrderWindow named window and does not fire when an order event was updated using on-update or merged using on-merge.

If your application would like to have the pattern fire for any change to the named window events including updates and merges, you must select from the named window as follows:

insert into OrderWindowChangeStream select * from LastOrderWindow
select * from pattern [every o=OrderWindowChangeStream(price >= 100) -> timer:interval(10 sec)]

A table cannot be listed as part of a pattern filter. However, any filter EPL expression can contain table access expressions and subqueries against tables.

Assuming that MyTable is a table, the following is not allowed:

// not allowed
select * from pattern [every MyTable -> timer:interval(10 sec)]

The every operator indicates that the pattern sub-expression should restart when the subexpression qualified by the every keyword evaluates to true or false. Without the every operator the pattern sub-expression stops when the pattern sub-expression evaluates to true or false.

As a side note, please be aware that a single invocation to the UpdateListener interface may deliver multiple events in one invocation, since the interface accepts an array of values.

Thus the every operator works like a factory for the pattern sub-expression contained within. When the pattern sub-expression within it fires and thus quits checking for events, the every causes the start of a new pattern sub-expression listening for more occurrences of the same event or set of events.

Every time a pattern sub-expression within an every operator turns true the engine starts a new active subexpression looking for more event(s) or timing conditions that match the pattern sub-expression. If the every operator is not specified for a subexpression, the subexpression stops after the first match was found.

This pattern fires when encountering an A event and then stops looking.

A

This pattern keeps firing when encountering A events, and doesn't stop looking.

every A

When using every operator with the -> followed-by operator, each time the every operator restarts it also starts a new subexpression instance looking for events in the followed-by subexpression.

Let's consider an example event sequence as follows.

A1   B1   C1   B2   A2   D1   A3   B3   E1   A4   F1   B4


The examples show that it is possible that a pattern fires for multiple combinations of events that match a pattern expression. Each combination is posted as an EventBean instance to the update method in the UpdateListener implementation.
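To see how the placement of the every operator changes the outcome for the event sequence above, consider the following Python sketch. It is a simplified model that handles only a two-step followed-by between A and B events. The function simulate and its parameters are assumptions made for illustration and are not part of the Esper API:

```python
def simulate(events, every_a=False, every_b=False, every_whole=False):
    """Simulate [every] A -> [every] B, or every (A -> B) when every_whole is set."""
    looking_for_a = True   # whether a filter for A events is active
    pending = []           # A events whose followed-by part still waits for a B
    matches = []
    for event in events:
        if event.startswith("A") and looking_for_a:
            pending.append(event)
            looking_for_a = every_a      # a plain A filter ends after its first match
        elif event.startswith("B") and pending:
            matches.extend((a, event) for a in pending)
            if not every_b:
                pending = []             # a plain B filter ends after its first match
                # every (A -> B) restarts the whole subexpression after it fires
                looking_for_a = looking_for_a or every_whole
    return matches


sequence = "A1 B1 C1 B2 A2 D1 A3 B3 E1 A4 F1 B4".split()
print(simulate(sequence, every_whole=True))            # every (A -> B)
print(simulate(sequence, every_a=True))                # every A -> B
print(simulate(sequence, every_b=True))                # A -> every B
print(simulate(sequence, every_a=True, every_b=True))  # every A -> every B
```

In this model, every (A -> B) yields {A1, B1}, {A2, B3} and {A4, B4}. The pattern every A -> B additionally yields {A3, B3} on arrival of B3. The pattern A -> every B pairs A1 with each of the four B events, and every A -> every B yields nine combinations in total.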

Let's consider the every operator in conjunction with a subexpression that matches 3 events that follow each other:

every (A -> B -> C)

The pattern first looks for A events. When an A event arrives, it looks for a B event. After the B event arrives, the pattern looks for a C event. Finally, when the C event arrives the pattern fires. The engine then starts looking for an A event again.

Assume that between the B event and the C event a second A2 event arrives. The pattern would ignore the A2 event entirely since it's then looking for a C event. As observed in the prior example, the every operator restarts the subexpression A -> B -> C only when the subexpression fires.

In the next statement the every operator applies only to the A event, not the whole subexpression:

every A -> B -> C

This pattern now matches for each A event that is followed by a B event and then a C event, regardless of when the A event arrives. Note that for each A event that arrives the pattern engine starts a new subexpression looking for a B event and then a C event, outputting each combination of matching events.

As the introduction of the every operator states, the operator starts new subexpression instances and can cause multiple matches to occur for a single arriving event.

Each new subexpression takes a small amount of system resources, so your application should carefully consider when subexpressions must end when designing patterns. Use the timer:within construct and the and not constructs to end active subexpressions. The data window onto a pattern stream does not serve to limit pattern sub-expression lifetime.

Let's look at a concrete example. Consider the following sequence of events arriving:

A1   A2   B1  

This pattern matches on arrival of B1 and outputs two events (an array of length 2 if using a listener). The two events are the combinations {A1, B1} and {A2, B1}:

every a=A -> b=B

The and not operators are used to end an active subexpression.

The next pattern matches on arrival of B1 and outputs only the last A event which is the combination {A2, B1}:

every a=A -> (b=B and not A)

The and not operators cause the subexpression looking for {A1, B?} to end when A2 arrives.

Similarly, in the pattern below the engine starts a new subexpression looking for a B event every 1 second. After 5 seconds there are 5 subexpressions active looking for a B event and 5 matches occur at once if a B event arrives after 5 seconds.

every timer:interval(1 sec) -> b=B

Again the and not operators can end subexpressions that are not intended to match any longer:

every timer:interval(1 sec) -> (b=B and not timer:interval(1 sec))
// equivalent to
every timer:interval(1 sec) -> (b=B where timer:within(1 sec))
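The effect of ending subexpressions can be estimated with a few lines of Python. This sketch is an illustration only: the function matches_for_b is hypothetical, times are seconds since pattern start, and a B event arriving exactly on a one-second boundary is not modeled:

```python
def matches_for_b(b_time, lifetime=None):
    """Return the start times of the subexpressions that match a B at b_time.

    lifetime=None models "every timer:interval(1 sec) -> b=B";
    lifetime=1 models the variant that ends each subexpression after 1 second.
    """
    starts = range(1, int(b_time) + 1)   # one new subexpression per elapsed second
    return [s for s in starts if lifetime is None or b_time - s < lifetime]


print(matches_for_b(5.5))               # [1, 2, 3, 4, 5]
print(matches_for_b(5.5, lifetime=1))   # [5]
```

Without a limit on subexpression lifetime, a B event arriving 5.5 seconds after pattern start produces five matches. With the one-second lifetime only the most recently started subexpression matches.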

In this example we consider a generic pattern in which the pattern must match for each A event followed by a B event and followed by a C event, in which both the B event and the C event must arrive within 1 hour of the A event. The first approach to the pattern is as follows:

every A  -> (B -> C) where timer:within(1 hour)

Consider the following sequence of events arriving:

A1   A2   B1   C1   B2   C2

First, the pattern as above never stops looking for A events since the every operator instructs the pattern to keep looking for A events.

When A1 arrives, the pattern starts a new subexpression that keeps A1 in memory and looks for any B event. At the same time, it also keeps looking for more A events.

When A2 arrives, the pattern starts a new subexpression that keeps A2 in memory and looks for any B event. At the same time, it also keeps looking for more A events.

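After the arrival of A2, there are 3 subexpressions active:

  • The first subexpression, with A1 in memory, looking for any B event.
  • The second subexpression, with A2 in memory, looking for any B event.
  • The third subexpression, looking for the next A event.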

In the pattern above, we have specified a 1-hour lifetime for subexpressions looking for B and C events. Thus, if no B and no C event arrive within 1 hour after A1, the first subexpression goes away. If no B and no C event arrive within 1 hour after A2, the second subexpression goes away. The third subexpression however stays around looking for more A events.

The pattern as shown above thus matches on arrival of C1 for combination {A1, B1, C1} and for combination {A2, B1, C1}, provided that B1 and C1 arrive within an hour of A1 and A2.

You may now ask how to match on {A1, B1, C1} and {A2, B2, C2} instead, since you may need to correlate on a given property.

The pattern as discussed above matches every A event followed by the first B event followed by the next C event, and doesn't specifically qualify the B or C events to look for based on the A event. To look for specific B and C events in relation to a given A event, the correlation must use one or more of the properties of the A event, such as the "id" property:

every a=A -> (B(id=a.id) -> C(id=a.id)) where timer:within(1 hour)

The pattern as shown above thus matches on arrival of C1 for combination {A1, B1, C1} and on arrival of C2 for combination {A2, B2, C2}.
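The correlated pattern can be modeled as follows. This Python sketch is a simplification and not engine code: the function correlate and the (kind, id, time) tuple layout are assumptions made for this example, and an event arriving exactly at the one-hour mark is treated as too late:

```python
def correlate(events, lifetime=3600):
    """events: iterable of (kind, id, time_in_seconds) tuples, ordered by time."""
    active = []    # one entry per A event: {"a": ..., "b": ..., "expires": ...}
    matches = []
    for event in events:
        kind, key, time = event
        # The timer:within guard ends subexpressions that ran out of time.
        active = [s for s in active if time < s["expires"]]
        if kind == "A":
            active.append({"a": event, "b": None, "expires": time + lifetime})
        elif kind == "B":
            for s in active:
                if s["b"] is None and s["a"][1] == key:
                    s["b"] = event           # first B with the same id as the A
        elif kind == "C":
            still_active = []
            for s in active:
                if s["b"] is not None and s["a"][1] == key:
                    matches.append((s["a"], s["b"], event))
                else:
                    still_active.append(s)
            active = still_active
    return matches


events = [("A", 1, 0), ("A", 2, 10), ("B", 1, 20), ("C", 1, 30), ("B", 2, 40), ("C", 2, 50)]
for match in correlate(events):
    print(match)
```

The sample assumes that A1, B1 and C1 carry id 1 and that A2, B2 and C2 carry id 2, so the two matches are the combinations {A1, B1, C1} and {A2, B2, C2}.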

Similar to the every operator in most aspects, the every-distinct operator indicates that the pattern sub-expression should restart when the subexpression qualified by the every-distinct keyword evaluates to true or false. In addition, the every-distinct eliminates duplicate results received from an active subexpression according to its distinct-value expressions.

The synopsis for the every-distinct pattern operator is:

every-distinct(distinct_value_expr [, distinct_value_expr [...]] [, expiry_time_period])

Within parentheses are one or more distinct_value_expr expressions that return the values by which to remove duplicates.

You may optionally specify an expiry_time_period time period. If present, the pattern engine expires and removes distinct key values that are older than the time period, removing their associated memory and allowing such distinct values to match again. When your distinct value expressions return an unlimited number of values, for example when your distinct value is a timestamp or auto-increment column, you should always specify an expiry time period.

When specifying properties in the distinct-value expression list, you must ensure that the event types providing properties are tagged. Only properties of event types within filter expressions that are sub-expressions to the every-distinct may be specified.

For example, this pattern keeps firing for every A event with a distinct value for its aprop property:

every-distinct(a.aprop) a=A

Note that the pattern above assigns the a tag to the A event and uses a.aprop to identify the aprop property as a value of the a event A.

A pattern that returns the first Sample event for each sensor, assuming sensor is a field that returns a unique id identifying the sensor that originated the Sample event, is:

every-distinct(s.sensor) s=Sample

The next pattern looks for pairs of A and B events and returns only the first pair for each combination of aprop of an A event and bprop of a B event:

every-distinct(a.aprop, b.bprop) (a=A and b=B)

The following pattern looks for A events followed by B events for which the value of the aprop of an A event is the same value of the bprop of a B event but only for each distinct value of aprop of an A event:

every-distinct(a.aprop) a=A -> b=B(bprop = a.aprop)

When specifying properties as part of distinct-value expressions, properties must be available from tagged event types in sub-expressions to the every-distinct.

The following patterns are not valid:

// Invalid: event type in filter not tagged
every-distinct(aprop) A
			
// Invalid: property not from a sub-expression of every-distinct
a=A -> every-distinct(a.aprop) b=B

When an active subexpression to every-distinct becomes permanently false, the distinct-values seen from the active subexpression are removed and the sub-expression within is restarted.

For example, the below pattern detects each A event distinct by the value of aprop.

every-distinct(a.aprop) (a=A and not B)

In the pattern above, when a B event arrives, the subexpression becomes permanently false and is restarted anew, detecting each A event distinct by the value of aprop without considering prior values.

When your distinct key is a timestamp or another property that produces an unlimited number of distinct values, specify an expiry time period.

The following example returns every distinct A event according to the timestamp property on the A event, retaining each timestamp value for 10 seconds:

every-distinct(a.timestamp, 10 seconds) a=A

In the example above, if the timestamp value of a given A event occurs again for another A event before 10 seconds have passed, that second A event is not a match. After 10 seconds have passed, an A event with the same timestamp value is a match again.
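The expiry of distinct keys can be pictured with the following Python sketch. It is a model for illustration only: the function every_distinct and the (arrival_time, key) tuples are assumptions, the sketch assumes that a key's age counts from the time it was first recorded, and a key that is exactly as old as the expiry period is still retained here:

```python
def every_distinct(events, expiry_seconds=None):
    """events: iterable of (arrival_time, key) tuples; returns the matching events."""
    first_seen = {}   # distinct key -> arrival time at which it was recorded
    matches = []
    for arrival, key in events:
        if expiry_seconds is not None:
            # Drop keys older than the expiry period so they can match again.
            first_seen = {k: t for k, t in first_seen.items()
                          if arrival - t <= expiry_seconds}
        if key not in first_seen:
            first_seen[key] = arrival
            matches.append((arrival, key))
    return matches


events = [(0, "t1"), (4, "t1"), (12, "t1"), (13, "t2")]
print(every_distinct(events, expiry_seconds=10))  # [(0, 't1'), (12, 't1'), (13, 't2')]
print(every_distinct(events))                     # [(0, 't1'), (13, 't2')]
```

Without an expiry period the dictionary of keys grows with every new distinct value, which is why an expiry period is advisable for keys such as timestamps.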

Do not use a timer:within guard with every-distinct to expire keys. The expiry time notation shown above is the recommended means to expire keys.

// This is not the same as above; it does not expire distinct keys and is not recommended
every-distinct(a.timestamp) a=A where timer:within(10 sec)

The repeat operator fires when a pattern sub-expression evaluates to true a given number of times. The synopsis is as follows:

[match_count] repeating_subexpr

The repeat operator is very similar to the every operator in that it restarts the repeating_subexpr pattern sub-expression up to a given number of times.

match_count is a positive number that specifies how often the repeating_subexpr pattern sub-expression must evaluate to true before the repeat expression itself evaluates to true, after which the engine may indicate a match.

For example, this pattern fires when the last of five A events arrives:

[5] A

Parentheses must be used for nested pattern sub-expressions. This pattern fires when the last of a total of any five A or B events arrives:

[5] (A or B)

Without parentheses the pattern semantics change, according to the operator precedence described earlier. This pattern fires when the last of a total of five A events arrives or a single B event arrives, whichever happens first:

[5] A or B

Tags can be used to name events in filter expressions of pattern sub-expressions. The next pattern looks for an A event followed by a B event, and a second A event followed by a second B event. The output event provides indexed and array properties of the same name:

[2] (a=A -> b=B)

Using tags with repeat is further described in Section 7.5.4.6, “Tags and the Repeat Operator”.

Consider the following pattern that demonstrates the behavior when a pattern sub-expression becomes permanently false:

[2] (a=A and not C)

In the case where a C event arrives before 2 A events arrive, the pattern above becomes permanently false.

Let's add an every operator to restart the pattern and thus keep matching for all pairs of A events that arrive without a C event in between each pair:

every [2] (a=A and not C)

Since pattern matches return multiple A events, your select clause should use tag a as an array, for example:

select a[0].id, a[1].id from pattern [every [2] (a=A and not C)]
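The behavior of the restarted repeat can be modeled in a few lines of Python. This sketch is illustrative only and the function pairs_without_c is an assumption made for this example. Each returned tuple corresponds to the array that the tag a provides:

```python
def pairs_without_c(events):
    """Simulate every [2] (a=A and not C) over a list of event names."""
    collected = []   # A events gathered by the current repeat
    matches = []
    for event in events:
        if event.startswith("C"):
            collected = []   # the repeat turns permanently false; every restarts it
        elif event.startswith("A"):
            collected.append(event)
            if len(collected) == 2:
                matches.append(tuple(collected))   # (a[0], a[1])
                collected = []
    return matches


print(pairs_without_c(["A1", "A2", "A3", "C1", "A4", "A5"]))
# [('A1', 'A2'), ('A4', 'A5')]
```

The A3 event does not appear in any pair because C1 arrives before a second A event completes the pair.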

The repeat until operator provides additional control over repeated matching.

The repeat until operator takes an optional range, a pattern sub-expression to repeat, the until keyword and a second pattern sub-expression that ends the repetition. The synopsis is as follows:

[range] repeated_pattern_expr until end_pattern_expr

Without a range, the engine matches the repeated_pattern_expr pattern sub-expression until the end_pattern_expr evaluates to true, at which time the expression turns true.

An optional range can be used to indicate the minimum number of times that the repeated_pattern_expr pattern sub-expression must become true.

The optional range can also specify a maximum number of times that repeated_pattern_expr pattern sub-expression evaluates to true and retains tagged events. When this number is reached, the engine stops the repeated_pattern_expr pattern sub-expression.

The until keyword is always required when specifying a range and is not required if specifying a fixed number of repeats as discussed in the section before.
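A rough model of the range handling, written as a Python sketch, may help. The function repeat_until and its low and high parameters are hypothetical names standing in for the two endpoints of the range. The sketch also makes one assumption that is not stated above: when the end expression turns true before the minimum count is reached, the expression ends without a match.

```python
def repeat_until(events, low=None, high=None):
    """Simulate [low:high] A until B; returns the retained A events or None."""
    collected = []
    for event in events:
        if event.startswith("A"):
            if high is None or len(collected) < high:
                collected.append(event)   # stop retaining once the maximum is reached
        elif event.startswith("B"):
            if low is not None and len(collected) < low:
                return None               # assumption: too few matches, no match
            return collected              # the end expression turned true
    return None                           # the end expression never turned true


print(repeat_until(["A1", "A2", "B1"]))                      # ['A1', 'A2']
print(repeat_until(["A1", "A2", "A3", "A4", "B1"], high=3))  # ['A1', 'A2', 'A3']
print(repeat_until(["A1", "B1"], low=3))                     # None
```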

Similar to the Java && operator, the and operator requires both nested pattern expressions to turn true before the whole expression turns true (a join pattern).

This pattern matches when both an A event and a B event arrive, at the time the last of the two events arrives:

A and B

This pattern matches on any sequence of an A event followed by a B event and then a C event followed by a D event, or a C event followed by a D and an A event followed by a B event:

(A -> B) and (C -> D)

Note that in an and pattern expression it is not possible to correlate events based on event property values. For example, this is an invalid pattern:

// This is NOT valid
a=A and B(id = a.id)

The above expression is invalid because it relies on the order of arrival of events. In an and expression the order of events is not specified, and events fulfill an and condition in any order. The above expression can be changed to use the followed-by operator:

// This is valid
a=A -> B(id = a.id)
// another example using 'and'...
a=A -> (B(id = a.id) and C(id = a.id))

Consider a pattern that looks for the same event:

A and A

The pattern above fires when a single A event arrives. The first arriving A event triggers a state transition in both the left and the right hand side expression.

In order to match after two A events arrive in any order, there are two options to express this pattern. The followed-by operator is one option and the repeat operator is the second option, as the next two patterns show:

A -> A
// ... or ...
[2] A

The not operator negates the truth value of an expression. Pattern expressions prefixed with not are automatically defaulted to true upon start, and turn permanently false when the expression within turns true.

The not operator is generally used in conjunction with the and operator or subexpressions as the below examples show.

This pattern matches only when an A event is encountered followed by a B event, and only if no C event was encountered before the A event and the B event have both arrived, counting from the time the pattern is started:

(A -> B) and not C

Assume we'd like to detect when an A event is followed by a D event, without any B or C events between the A and D events:

A -> (D and not (B or C))

It may help your understanding to discuss a pattern that uses the or operator and the not operator together:

a=A -> (b=B or not C)

In the pattern above, when an A event arrives then the engine starts the subexpression B or not C. As soon as the subexpression starts, the not operator turns to true. The or expression turns true and thus your listener receives an invocation providing the A event in the property 'a'. The subexpression does not end and continues listening for B and C events. Upon arrival of a B event your listener receives a second invocation. If instead a C event arrives, the not turns permanently false however that does not affect the or operator (but would end an and operator).

To test for absence of an event, use timer:interval together with and not operators. The sample statement reports each 10-second interval during which no A event occurred:

every (timer:interval(10 sec) and not A)

In many cases the not operator, when used alone, does not make sense. The following example is invalid and will log a warning when the engine is started:

// not a sensible pattern
(not a=A) -> B(id=a.id)

The followed by -> operator specifies that first the left hand expression must turn true and only then is the right hand expression evaluated for matching events.

Look for an A event and if encountered, look for a B event. As always, A and B can themselves be nested event pattern expressions.

A -> B

This is a pattern that fires when 2 status events indicating an error occur one after the other.

StatusEvent(status='ERROR') -> StatusEvent(status='ERROR')

A pattern that takes all A events that are not followed by a B event within 5 minutes:

every A -> (timer:interval(5 min) and not B)

A pattern that takes all A events that are not preceded by B within 5 minutes:

every (timer:interval(5 min) and not B -> A)

The followed-by -> operator can optionally be provided with an expression that limits the number of sub-expression instances of the right-hand side pattern sub-expression.

The synopsis for the followed-by operator with limiting expression is:

lhs_expression -[limit_expression]> rhs_expression

Each time the lhs_expression pattern sub-expression turns true the pattern engine starts a new rhs_expression pattern sub-expression. The limit_expression returns an integer value that defines a maximum number of pattern sub-expression instances that can simultaneously be present for the same rhs_expression.

When the limit is reached the pattern engine issues a com.espertech.esper.client.hook.ConditionPatternSubexpressionMax notification object to any condition handlers registered with the engine as described in Section 15.12, “Condition Handling” and does not start a new pattern sub-expression instance for the right-hand side pattern sub-expression.

For example, consider the following pattern which returns for every A event the first B event that matches the id field value of the A event:

every a=A -> b=B(id = a.id)

In the above pattern, every time an A event arrives (lhs) the pattern engine starts a new pattern sub-expression (rhs) consisting of a filter for the first B event that has the same value for the id field as the A event.

In some cases your application may want to limit the number of right-hand side sub-expressions because of memory concerns or to reduce output. You may add a limit expression returning an integer value as part of the operator.

This example employs the followed-by operator with a limit expression to indicate that maximally 2 filters for B events (the right-hand side pattern sub-expression) may be active at the same time:

every a=A -[2]> b=B(id = a.id)

Note that the limit expression in the example above is not a limit per value of id field, but a limit counting all right-hand side pattern sub-expression instances that are managed by that followed-by sub-expression instance.

If your followed-by operator lists multiple sub-expressions with limits, each limit applies to the immediate right-hand side. For example, the pattern below limits the number of filters for B events to 2 and the number of filters for C events to 3:

every a=A -[2]> b=B(id = a.id) -[3]> c=C(id = a.id)
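The counting behavior of a limit expression can be sketched in Python. The following model covers only the single-limit pattern every a=A -[2]> b=B(id = a.id). The function limited_followed_by and the (kind, id) tuples are assumptions made for illustration, and the rejected list stands in for the notification that the engine issues to condition handlers:

```python
def limited_followed_by(events, limit):
    """Simulate every a=A -[limit]> b=B(id = a.id); events are (kind, id) tuples."""
    active = []      # ids of A events with a live filter for a matching B event
    matches = []
    rejected = []    # A events that did not start a new right-hand side instance
    for kind, key in events:
        if kind == "A":
            if len(active) >= limit:
                rejected.append(key)   # limit reached: no new sub-expression instance
            else:
                active.append(key)
        elif kind == "B":
            matches.extend((a, key) for a in active if a == key)
            active = [a for a in active if a != key]   # matched instances end
    return matches, rejected


events = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("A", 4), ("B", 3), ("B", 4)]
print(limited_followed_by(events, limit=2))  # ([(1, 1), (4, 4)], [3])
```

The A event with id 3 arrives while two filters for B events are already active, so no filter is started for it and the later B event with id 3 does not match. The limit counts all active instances and is not a limit per id value.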

Esper allows setting a maximum number of pattern sub-expressions in the configuration, applicable to all followed-by operators of all statements.

If your application has patterns in multiple EPL statements and all such patterns should count towards a total number of pattern sub-expression counts, you may consider setting a maximum number of pattern sub-expression instances, engine-wide, via the configuration described in Section 16.4.15.1, “Followed-By Operator Maximum Subexpression Count”.

When the limit is reached the pattern engine issues a notification object to any condition handlers registered with the engine as described in Section 15.12, “Condition Handling”. Depending on your configuration the engine can prevent the start of a new pattern sub-expression instance for the right-hand side pattern sub-expression, until pattern sub-expression instances end or statements are stopped or destroyed.

The notification object issued to condition handlers is an instance of com.espertech.esper.client.hook.ConditionPatternEngineSubexpressionMax. The notification object contains information about which statement triggered the limit and the pattern counts per statement for all statements.

For information on static and runtime configuration, please consult Section 16.4.15.1, “Followed-By Operator Maximum Subexpression Count”. The limit can be changed and disabled or enabled at runtime via the runtime configuration API.

Guards are where-conditions that control the lifecycle of subexpressions. Custom guard functions can also be used. Chapter 18, Integration and Extension outlines guard plug-in development in greater detail.

The pattern guard where-condition has no relationship to the EPL where clause that filters sets of events.

Take as an example the following pattern expression:

MyEvent where timer:within(10 sec)

In this pattern the timer:within guard controls the subexpression that is looking for MyEvent events. The guard terminates the subexpression looking for MyEvent events 10 seconds after the start of the pattern. Thus the pattern alerts only once, when the first MyEvent event arrives within 10 seconds after the start of the pattern.

The every keyword requires additional discussion since it also controls subexpression lifecycle. Let's add the every keyword to the example pattern:

every MyEvent where timer:within(10 sec)

The difference to the pattern without every is that each MyEvent event that arrives now starts a new subexpression, including a new guard, looking for a further MyEvent event. The result is that, when a MyEvent arrives within 10 seconds after pattern start, the pattern execution will look for the next MyEvent event to arrive within 10 seconds after the previous one.

By placing parentheses around the every keyword and its subexpression, we can have the every under the control of the guard:

(every MyEvent) where timer:within(10 sec)

In the pattern above, the guard terminates the subexpression looking for all MyEvent events 10 seconds after the start of the pattern. This pattern alerts for all MyEvent events arriving within 10 seconds after pattern start, and then stops.

Guards do not change the truth value of the subexpression of which the guard controls the lifecycle, and therefore do not cause a restart of the subexpression when used with the every operator. For example, the next pattern stops returning matches after 10 seconds unless a match occurred within 10 seconds after pattern start:

every ( (A and B) where timer:within(10 sec) )

The timer:within guard acts like a stopwatch. If the associated pattern expression does not turn true within the specified time period, it is stopped and becomes permanently false.

The synopsis for timer:within is as follows:

timer:within(time_period_expression)

The time_period_expression is a time period (see Section 5.2.1, “Specifying Time Periods”) or an expression providing a number of seconds as a parameter. The interval expression may contain references to properties of prior events in the same pattern as well as variables and substitution parameters.

This pattern fires if an A event arrives within 5 seconds after statement creation.

A where timer:within (5 seconds)

This pattern fires for all A events that arrive within 5 seconds. After 5 seconds, this pattern stops matching even if more A events arrive.

(every A) where timer:within (5 seconds)

This pattern matches for any one A or B event in the next 5 seconds.

( A or B ) where timer:within (5 sec)

This pattern matches for any 2 errors that happen within 10 seconds of each other.

every (StatusEvent(status='ERROR') -> StatusEvent(status='ERROR') where timer:within (10 sec))

The following guards are equivalent:

timer:within(2 minutes 5 seconds)
timer:within(125 sec)
timer:within(125)

The timer:withinmax guard is similar to the timer:within guard and acts as a stopwatch that additionally has a counter that counts the number of matches. It ends the subexpression when either the stopwatch ends or the match counter maximum value is reached.

The synopsis for timer:withinmax is as follows:

timer:withinmax(time_period_expression, max_count_expression)

The time_period_expression is a time period (see Section 5.2.1, “Specifying Time Periods”) or an expression providing a number of seconds.

The max_count_expression provides the maximum number of matches before the guard ends the subexpression.

Each parameter expression may also contain references to properties of prior events in the same pattern as well as variables and substitution parameters.

This pattern fires for every A event that arrives within 5 seconds after statement creation but only up to the first two A events:

(every A) where timer:withinmax (5 seconds, 2)

If the result of the max_count_expression is 1, the guard ends the subexpression after the first match and indicates the first match.

This pattern fires for the first A event that arrives within 5 seconds after statement creation:

(every A) where timer:withinmax (5 seconds, 1)

If the result of the max_count_expression is zero, the guard ends the subexpression upon the first match and does not indicate any matches.

This example receives every A event followed by every B event (as each B event arrives) until the 5-second subexpression timer ends or X number of B events have arrived (assume X was declared as a variable):

every A -> (every B) where timer:withinmax (5 seconds, X)
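The interplay of stopwatch and match counter can be modeled with a short Python sketch. The function within_max is hypothetical and models only the form (every A) where timer:withinmax(seconds, max_count). Times are seconds since the guard started, and a match arriving exactly when the stopwatch ends is treated as too late, which is an assumption:

```python
def within_max(match_times, seconds, max_count):
    """Return the arrival times of the matches that the guard lets through."""
    indicated = []
    for time in sorted(match_times):
        # The guard ends when the stopwatch runs out or the counter is exhausted.
        # With max_count of zero this happens on the first match, before indicating it.
        if time >= seconds or len(indicated) >= max_count:
            break
        indicated.append(time)
    return indicated


print(within_max([1, 2, 3], seconds=5, max_count=2))  # [1, 2]
print(within_max([1, 2, 3], seconds=5, max_count=1))  # [1]
print(within_max([1, 2, 3], seconds=5, max_count=0))  # []
print(within_max([1, 6, 7], seconds=5, max_count=2))  # [1]
```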

The timer:interval pattern observer waits for the defined time before the truth value of the observer turns true. The observer takes a time period (see Section 5.2.1, “Specifying Time Periods”) as a parameter, or an expression that returns the number of seconds.

The observer may be parameterized by an expression that contains one or more references to properties of prior events in the same pattern, or may also reference variables, substitution parameters or any other expression returning a numeric value.

After an A event arrives, wait 10 seconds and then indicate that the pattern matches.

A -> timer:interval(10 seconds) 

The pattern below fires every 20 seconds.

every timer:interval(20 sec)

The next example pattern fires for every A event that is not followed by a B event within 60 seconds after the A event arrived. The B event must have the same "id" property value as the A event.

every a=A -> (timer:interval(60 sec) and not B(id=a.id)) 

Consider the next example, which assumes that the A event has a property waittime:

every a=A -> (timer:interval(a.waittime + 2) and not B(id=a.id))

In the above pattern the logic waits for 2 seconds plus the number of seconds provided by the value of the waittime property of the A event.
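The per-event waiting time can be worked through with a small Python model. The function unanswered and the (time, kind, id, waittime) tuples are assumptions made for this sketch. Times are in seconds, and a B event arriving exactly at the deadline is not counted as arriving in time:

```python
def unanswered(events):
    """events: (time, kind, id, waittime) tuples; returns (fire_time, id) pairs."""
    fired = []
    for time, kind, key, waittime in events:
        if kind != "A":
            continue
        deadline = time + waittime + 2   # timer:interval(a.waittime + 2)
        answered = any(k == "B" and i == key and time < t < deadline
                       for t, k, i, _ in events)
        if not answered:
            fired.append((deadline, key))
    return sorted(fired)


events = [
    (0, "A", 1, 3),      # waits until second 5 for a B event with id 1
    (1, "A", 2, 10),     # waits until second 13 for a B event with id 2
    (4, "B", 1, None),   # arrives in time, so no match for id 1
    (20, "B", 2, None),  # arrives too late
]
print(unanswered(events))  # [(13, 2)]
```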

The timer:at pattern observer is similar in function to the Unix “crontab” command. At a specified time the expression turns true. The at operator can also be made to pattern match at regular intervals by using an every operator in front of the timer:at operator.

The syntax is: timer:at (minutes, hours, days of month, months, days of week [, seconds [, time zone]]).

The values for seconds and time zone are optional. Each element allows wildcard * values. Ranges can be specified by means of a lower bound, then a colon ‘:’, then the upper bound. The division operator */x can be used to specify that every xth value is valid. Combinations of these operators can be used by placing these into square brackets ([]).

The timer:at observer may also be parameterized by an expression that contains one or more references to properties of prior events in the same pattern, or may also reference variables, substitution parameters or any other expression returning a numeric value. The frequency division operator */x and parameters lists within brackets ([]) are an exception: they may only contain variables, substitution parameters or numeric values.

This pattern matches at 5 minutes past every hour.

every timer:at(5, *, *, *, *)

The below timer:at pattern matches every 15 minutes from 8am to 5:45pm (hours 8 to 17 at 0, 15, 30 and 45 minutes past the hour) on even numbered days of the month as well as on the first day of the month.

timer:at (*/15, 8:17, [*/2, 1], *, *)
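To make the field operators concrete, the following Python sketch evaluates the five mandatory fields against a given point in time. It is not the engine's implementation: the functions field_matches and at_matches are hypothetical, only the wildcard, range, division and bracket-list forms are handled, and the keywords, seconds and time zone are left out:

```python
def field_matches(spec, value):
    """Check one crontab field such as '*', '*/15', '8:17', '[*/2, 1]' or '5'."""
    spec = spec.strip()
    if spec.startswith("[") and spec.endswith("]"):
        # A bracket list matches when any of its members matches.
        return any(field_matches(part, value) for part in spec[1:-1].split(","))
    if spec == "*":
        return True
    if spec.startswith("*/"):
        return value % int(spec[2:]) == 0    # every xth value
    if ":" in spec:
        low, high = spec.split(":")
        return int(low) <= value <= int(high)
    return value == int(spec)


def at_matches(fields, minute, hour, day_of_month, month, day_of_week):
    values = (minute, hour, day_of_month, month, day_of_week)
    return all(field_matches(spec, value) for spec, value in zip(fields, values))


fields = ["*/15", "8:17", "[*/2, 1]", "*", "*"]
print(at_matches(fields, 45, 17, 1, 6, 3))  # True: 5:45pm on the first day of the month
print(at_matches(fields, 45, 17, 3, 6, 3))  # False: day 3 is odd and not the first
```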

The below table outlines the fields, valid values and keywords available for each field:


The keyword last used in the days-of-month field means the last day of the month (current month). To specify the last day of another month, a value for the month field has to be provided. For example: timer:at(*, *, last,2,*) is the last day of February.

The last keyword in the day-of-week field by itself simply means Saturday. If used in the day-of-week field after another value, it means "the last xxx day of the month" - for example "5 last" means "the last Friday of the month". So the last Friday of the current month will be: timer:at(*, *, *, *, 5 last). And the last Friday of June: timer:at(*, *, *, 6, 5 last).

The keyword weekday is used to specify the weekday (Monday-Friday) nearest the given day. A variant can include the month, as in: timer:at(*, *, 30 weekday, 9, *), which for the year 2007 is Friday, September 28th (the schedule does not jump over the month boundary).

The keyword lastweekday is a combination of the last and the weekday keywords. A typical example is: timer:at(*, *, lastweekday, 9, *), which selects Friday, September 28th (example year is 2007).
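The dates quoted for the last, weekday and lastweekday keywords can be checked with java.time. The following is a minimal sketch, not Esper's implementation: the class CalendarKeywordSketch and its methods are hypothetical, and the rule for staying within the month is an assumption modeled on the September 2007 example in the text.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.temporal.TemporalAdjusters;

// Illustrative only: this is NOT Esper code.
final class CalendarKeywordSketch {

    static boolean isWeekend(LocalDate date) {
        DayOfWeek dow = date.getDayOfWeek();
        return dow == DayOfWeek.SATURDAY || dow == DayOfWeek.SUNDAY;
    }

    // "5 last" in the days-of-week field: the last Friday of the month.
    static LocalDate lastInMonth(int year, int month, DayOfWeek dow) {
        return LocalDate.of(year, month, 1).with(TemporalAdjusters.lastInMonth(dow));
    }

    // "lastweekday": the last Monday-Friday day of the month.
    static LocalDate lastWeekday(int year, int month) {
        LocalDate date = YearMonth.of(year, month).atEndOfMonth();
        while (isWeekend(date)) {
            date = date.minusDays(1);
        }
        return date;
    }

    // "N weekday": the Monday-Friday day nearest to day N.
    // ASSUMPTION: if the nearest weekday falls in another month, step the
    // other way instead, so the result never leaves the given month.
    // The day must exist in the month, otherwise LocalDate.of throws.
    static LocalDate nearestWeekday(int year, int month, int day) {
        LocalDate target = LocalDate.of(year, month, day);
        DayOfWeek dow = target.getDayOfWeek();
        LocalDate candidate = target;
        if (dow == DayOfWeek.SATURDAY) {
            candidate = target.minusDays(1);   // Friday before
        } else if (dow == DayOfWeek.SUNDAY) {
            candidate = target.plusDays(1);    // Monday after
        }
        if (candidate.getMonthValue() != month) {
            candidate = (dow == DayOfWeek.SATURDAY)
                    ? target.plusDays(2)       // Monday after
                    : target.minusDays(2);     // Friday before
        }
        return candidate;
    }

    public static void main(String[] args) {
        System.out.println(lastInMonth(2007, 6, DayOfWeek.FRIDAY)); // last Friday of June 2007
        System.out.println(nearestWeekday(2007, 9, 30));            // 30 weekday, September 2007
        System.out.println(lastWeekday(2007, 9));                   // lastweekday, September 2007
    }
}
```

September 30th, 2007 is a Sunday, so both nearestWeekday(2007, 9, 30) and lastWeekday(2007, 9) resolve to Friday, September 28th, which agrees with the dates given in the text.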

The time zone is a string-type value that specifies the time zone of the schedule. You must specify a value for seconds when specifying a time zone. Esper relies on the java.util.TimeZone to interpret the time zone value. Note that TimeZone does not validate time zone strings.

The following timer:at pattern matches at 5:00 pm Pacific Standard Time (PST):

timer:at (0, 17, *, *, *, *, 'PST')
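Because java.util.TimeZone does not reject unknown identifiers, a misspelled time zone silently resolves to GMT. The following minimal sketch shows one way an application could check a time zone string before placing it into a pattern. The class TimeZoneIdCheck and the identifier Pacific/Nowhere are hypothetical; the check is something an application would perform itself, not a feature of Esper.

```java
import java.util.Arrays;
import java.util.TimeZone;

// Illustrative only: an application-side check, not part of Esper.
final class TimeZoneIdCheck {

    // Returns true if the JVM lists the id among its available time zone ids.
    static boolean isKnownId(String id) {
        return Arrays.asList(TimeZone.getAvailableIDs()).contains(id);
    }

    public static void main(String[] args) {
        // An id that cannot be understood falls back to the GMT zone
        // instead of raising an error.
        String bogus = "Pacific/Nowhere"; // hypothetical, deliberately invalid id
        System.out.println(bogus + " resolves to " + TimeZone.getTimeZone(bogus).getID());
        System.out.println(bogus + " known? " + isKnownId(bogus));

        // Ids the application intends to use can be checked the same way.
        System.out.println("PST known? " + isKnownId("PST"));
        System.out.println("America/Los_Angeles known? " + isKnownId("America/Los_Angeles"));
    }
}
```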

Any expression may occur among the parameters. This example invokes a user-defined function computeHour to return an hour:

timer:at (0, computeHour(), *, *, *, *)

The following restrictions apply to crontab parameters:

  • It is not possible to specify both Days Of Month and Days Of Week.

The timer:schedule observer is a flexible observer for scheduling.

The observer implements relevant parts of the ISO 8601 specification; however, it is not necessary to use ISO 8601 formats. ISO 8601 is an international standard covering the exchange of date and time-related data. The standard specifies a date format, a format for time periods and a format for specifying the number of repetitions. Please find more information on ISO 8601 at Wikipedia.

The observer takes the following named parameters:


For example, the below pattern schedules two callbacks: the first callback on 2008-03-01 at 13:00:00 UTC and the second callback on 2009-05-11 at 15:30:00 UTC.

select * from pattern[every timer:schedule(iso: 'R2/2008-03-01T13:00:00Z/P1Y2M10DT2H30M')]
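The two callback times follow from adding the period P1Y2M10DT2H30M to the start date once. The following is a minimal Java sketch of that arithmetic using java.time, not Esper's implementation: the class IsoScheduleSketch is hypothetical, it handles only the R<count>/<date>/<period> form with an explicit count, and it assumes each callback is the previous callback plus the period.

```java
import java.time.Duration;
import java.time.Period;
import java.time.ZonedDateTime;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: this is NOT Esper code.
final class IsoScheduleSketch {

    // Computes the callback times for "R<count>/<start>/<period>".
    // ASSUMPTION: callback k is the start plus k periods, added one at a time.
    static List<ZonedDateTime> callbacks(String iso) {
        String[] parts = iso.split("/");
        int repetitions = Integer.parseInt(parts[0].substring(1)); // strip the "R"
        ZonedDateTime start = ZonedDateTime.parse(parts[1]);

        // java.time splits an ISO 8601 period into a date-based Period
        // (before the 'T') and a time-based Duration (from the 'T' on).
        String period = parts[2];
        int t = period.indexOf('T');
        String dateText = t < 0 ? period : period.substring(0, t);
        Period datePart = dateText.equals("P") ? Period.ZERO : Period.parse(dateText);
        Duration timePart = t < 0 ? Duration.ZERO : Duration.parse("P" + period.substring(t));

        List<ZonedDateTime> result = new ArrayList<>();
        ZonedDateTime next = start;
        for (int i = 0; i < repetitions; i++) {
            result.add(next);
            next = next.plus(datePart).plus(timePart);
        }
        return result;
    }

    public static void main(String[] args) {
        for (ZonedDateTime time : callbacks("R2/2008-03-01T13:00:00Z/P1Y2M10DT2H30M")) {
            System.out.println(time.toInstant());
        }
    }
}
```

Step by step: 2008-03-01 plus 1 year is 2009-03-01, plus 2 months is 2009-05-01, plus 10 days is 2009-05-11, and 13:00 plus 2 hours 30 minutes is 15:30.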

The number of repetitions, the date and the period can be provided separately and do not have to be ISO 8601 strings, allowing each part to be its own expression.

This example specifies separate expressions. The equivalent schedule to the above example is:

select * from pattern[every timer:schedule(repetitions: 2, date: '2008-03-01T13:00:00Z', period: 1 year 2 month 10 days 2 hours 30 minutes)]

When providing the iso parameter, it must be the only parameter. The repetitions parameter is only allowed in conjunction with other parameters.

Using match recognize, patterns are defined in the familiar syntax of regular expressions.

The match recognize syntax presents an alternative way to specify pattern detection as compared to the EPL pattern language described in the previous chapter. A comparison of match recognize and EPL patterns is below.

The match recognize syntax is a proposal for incorporation into the SQL standard. It is thus subject to change as the standard evolves (the standard has not been finalized yet). Please consult row-pattern-recogniton-11-public for further information.

You may be familiar with regular expressions in the context of finding text of interest in a string, such as particular characters, words, or patterns of characters. Instead of matching characters, match recognize matches sequences of events of interest.

Esper can apply match-recognize patterns in real-time upon arrival of new events in a stream of events (also termed incrementally, streaming or continuous). Esper can also match patterns on-demand via the iterator pull-API, if specifying a named window or data window on a stream (tables cannot be used in the from-clause with match-recognize).

This section compares pattern detection via match recognize and via the EPL pattern language.

Table 8.1. Comparison Match Recognize to EPL Patterns

Category | EPL Patterns | Match Recognize
Purpose | Pattern detection in sequences of events. | Same.
Standards | Not standardized, similar to Rapide pattern language. | Proposal for incorporation into the SQL standard.
Real-time Processing | Yes. | Yes.
On-Demand query via Iterator | No. | Yes.
Language | Nestable expressions consisting of boolean AND, OR, NOT and time or arrival-based constructs such as -> (followed-by), timer:within and timer:interval. | Regular expression consisting of variables each representing conditions on events.
Event Types | An EPL pattern may react to multiple different types of events. | The input is a single type of event (unless used with variant streams).
Data Window Interaction | Disconnected, i.e. an event leaving a data window does not change pattern state. | Connected, i.e. an event leaving a data window removes the event from match selection.
Semantic Evaluation | Truth-value based: An EPL pattern such as (A and B) can fire when a single event arrives that satisfies both A and B conditions. | Sequence-based: A regular expression (A B) requires at least two events to match.
Time Relationship Between Events | The timer:within, timer:interval and NOT operator can expressively search for absence of events or other more complex timing relationships. | Some support for detecting absence of events using the interval clause.
Extensibility | Custom pattern objects, user-defined functions. | User-defined functions, custom aggregation functions.
Memory Use | Likely between 500 bytes and 2k per open sequence, depends on pattern. | Likely between 100 bytes and 1k per open sequence, depends on pattern.