Class KeywordSet
This class provides a lightweight, immutable container optimized for fast keyword lookups using binary search. Keywords are stored in a sorted array, so each lookup takes O(log n) time. This is particularly useful for parsers, compilers, and syntax highlighters that frequently need to check whether an identifier is a reserved keyword.
Features:
- Immutable after construction - thread-safe for concurrent reads
- Efficient O(log n) lookups using binary search
- Compact memory footprint using sorted array
- Automatic rejection of null, empty, and single-character strings
Implementation Details:
Keywords are sorted lexicographically during construction and stored in an internal array. The contains(String)
method uses Arrays.binarySearch(Object[], Object) for efficient lookups. Strings shorter than 2 characters
are automatically rejected without performing a search, as most programming languages don't have single-character
keywords.
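The lookup strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the class's actual source: the name KeywordSetSketch, the field name store, and the defensive copy of the input array are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration; not the documented class's real source.
final class KeywordSetSketch {
    private final String[] store;

    KeywordSetSketch(String... keywords) {
        // Assumption: copy the input so later changes to the caller's array cannot affect the set.
        this.store = keywords.clone();
        // Sort by natural (lexicographic) String ordering so binary search is valid.
        Arrays.sort(this.store);
    }

    boolean contains(String s) {
        // Short-circuit: null and strings shorter than 2 characters are rejected without searching.
        if (s == null || s.length() < 2) {
            return false;
        }
        // binarySearch returns a non-negative index only when the key is present.
        return Arrays.binarySearch(store, s) >= 0;
    }

    public static void main(String[] args) {
        KeywordSetSketch keywords = new KeywordSetSketch("zebra", "apple", "banana");
        System.out.println(keywords.contains("apple"));  // true
        System.out.println(keywords.contains("cherry")); // false
        System.out.println(keywords.contains("a"));      // false (too short)
    }
}
```

Because the array never changes after the constructor returns, concurrent readers need no synchronization in a design like this.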
Examples:
Use Cases:
- Programming language parsers - checking if tokens are reserved words
- Syntax highlighters - identifying keywords for special formatting
- Code analyzers - distinguishing keywords from identifiers
- Template engines - recognizing template keywords
- Query languages - validating reserved words
Notes:
- This class is immutable and thread-safe after construction. Multiple threads can safely call contains(String) concurrently.
- Keywords are compared using exact string matching (case-sensitive). For case-insensitive matching, normalize your keywords and input strings to the same case.
- The minimum keyword length is 2 characters. Single-character strings are automatically rejected by contains(String), even if one was passed to the constructor.
- Consider creating KeywordSet instances as static final constants to avoid repeated construction.
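The case-normalization and static-constant advice in the notes above might look like the sketch below. To keep it runnable without the library, a plain sorted array and Arrays.binarySearch stand in for KeywordSet; the class name SqlKeywords, the helper sorted, and the method isKeyword are hypothetical. With the real class, the constant would presumably be a static final KeywordSet and the lookup a call to contains(String).

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical example class; a sorted array stands in for a static final KeywordSet.
final class SqlKeywords {
    // Built once at class initialization and shared by all callers.
    // Keywords are stored in upper case so inputs can be normalized to the same case.
    private static final String[] SORTED = sorted("SELECT", "FROM", "WHERE", "INSERT");

    private static String[] sorted(String... keywords) {
        String[] copy = keywords.clone();
        Arrays.sort(copy);
        return copy;
    }

    static boolean isKeyword(String token) {
        if (token == null || token.length() < 2) {
            return false;
        }
        // Locale.ROOT avoids locale-specific case mappings when normalizing.
        String normalized = token.toUpperCase(Locale.ROOT);
        return Arrays.binarySearch(SORTED, normalized) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isKeyword("select")); // true
        System.out.println(isKeyword("Where"));  // true
        System.out.println(isKeyword("table"));  // false (not in this small list)
    }
}
```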
See Also:
-
Constructor Summary
Constructors
Constructor | Description
KeywordSet(String... keywords) | Creates a new keyword set with the specified keywords.
-
Method Summary
Modifier and Type | Method | Description
boolean | contains(String) | Checks if the specified string is a keyword in this set.
boolean | equals(Object) | Compares the specified object with this keyword set for equality.
int | hashCode() | Returns the hash code value for this keyword set.
String | toString() | Returns a string representation of this keyword set.
-
Constructor Details
-
KeywordSet
Creates a new keyword set with the specified keywords.
Keywords are automatically sorted during construction. Duplicate keywords are allowed but provide no benefit. For best performance, pass unique keywords.
Example:
    // Create a keyword set for SQL keywords
    KeywordSet sql = new KeywordSet("SELECT", "FROM", "WHERE", "INSERT", "UPDATE", "DELETE", "CREATE", "DROP", "TABLE", "INDEX");

    // Keywords can be passed in any order
    KeywordSet keywords = new KeywordSet("zebra", "apple", "banana");
    assertTrue(keywords.contains("apple")); // Sorted internally
Parameters:
keywords - The keywords to store. Can be empty but not null. Individual keywords can be any non-null string.
-
-
Method Details
-
contains
Checks if the specified string is a keyword in this set.
This method performs an O(log n) binary search on the sorted keyword array. Null strings and strings with fewer than 2 characters are automatically rejected without performing a search.
Example:
    KeywordSet keywords = new KeywordSet("class", "interface", "enum");

    // Standard checks
    assertTrue(keywords.contains("class"));    // Keyword exists
    assertFalse(keywords.contains("MyClass")); // Not a keyword

    // Edge cases handled gracefully
    assertFalse(keywords.contains(null)); // null returns false
    assertFalse(keywords.contains(""));   // Empty string returns false
    assertFalse(keywords.contains("a"));  // Single char returns false

    // Case-sensitive matching
    assertTrue(keywords.contains("class"));
    assertFalse(keywords.contains("CLASS")); // Different case
Performance:
- Time complexity: O(log n) using binary search
- Space complexity: O(1) - no additional memory allocated
- Short-circuit: Strings with length < 2 return immediately without searching
Parameters:
s - The string to check. Can be null.
Returns:
true if the string exists in this keyword set, false if it doesn't exist, is null, or has fewer than 2 characters.
-
toString
Returns a string representation of this keyword set.
The format follows the standard Java set convention: "[keyword1, keyword2, ...]"
-
equals
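Compares the specified object with this keyword set for equality.
Returns true if the given object is also a keyword set and contains the same keywords in the same order. Two keyword sets are equal if their internal arrays are equal. Because keywords are sorted during construction, the order in which they were passed to the constructor does not affect the comparison.
-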
hashCode
Returns the hash code value for this keyword set.
The hash code is computed from the internal array of keywords using Arrays.hashCode(Object[]). This ensures that ks1.equals(ks2) implies that ks1.hashCode() == ks2.hashCode() for any two keyword sets ks1 and ks2, as required by the general contract of Object.hashCode().
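The array-based equality and hash code described for equals and hashCode can be illustrated with plain sorted arrays. This is only a sketch: KeywordArrayDemo and sortedCopy are hypothetical names, and using Arrays.toString to produce the bracketed format described for toString() is an assumption rather than something the documentation states.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the documented class.
final class KeywordArrayDemo {
    // Mimics the documented behavior of sorting keywords during construction.
    static String[] sortedCopy(String... keywords) {
        String[] copy = keywords.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        String[] a = sortedCopy("zebra", "apple", "banana");
        String[] b = sortedCopy("banana", "zebra", "apple");

        // Same keywords in a different input order give equal arrays and equal hash codes.
        System.out.println(Arrays.equals(a, b));                      // true
        System.out.println(Arrays.hashCode(a) == Arrays.hashCode(b)); // true
        System.out.println(Arrays.toString(a));                       // [apple, banana, zebra]

        // A duplicate changes the array length, so the arrays no longer compare equal.
        String[] c = sortedCopy("apple", "apple", "banana", "zebra");
        System.out.println(Arrays.equals(a, c));                      // false
    }
}
```

If the class keeps duplicates in its internal array, as the sketch assumes, passing unique keywords also keeps equality comparisons predictable.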
-