As a pet project, I'm writing a Python CSS-selector library that lets one write CSS selectors as Python expressions. The goal of the library is to represent selectors in a flat, intuitive and interesting way; all valid syntax defined by the Selectors Level 4 Draft must be supported, in one way or another.
    # lorem|foo.bar[baz^="qux"]:has(:invalid)::first-line
    selector = (Namespace('lorem') | Tag('foo')) \
      .bar \
      # Can also be written as [Attribute('baz').starts_with('qux')]
      [Attribute('baz', '^=', 'qux')] \
      # '>>' is used instead of ' '.
      [:'has', (Selector.SELF >> PseudoClass('invalid'),)] \
      [::'first-line']
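For readers unfamiliar with the trick behind [:'has', (...)] and [::'first-line']: Python passes slice objects to __getitem__, so the string lands in the slice's stop or step field. Below is a minimal sketch of that mechanism only. The Sel class and its string-based rendering are hypothetical stand-ins, not the library's actual classes.

```python
from __future__ import annotations


class Sel:
    """Hypothetical stand-in for a compound selector; it only builds CSS text."""

    def __init__(self, css: str) -> None:
        self.css = css

    def __getitem__(self, key: object) -> Sel:
        # sel[:'has', (arg,)] arrives as the tuple (slice(None, 'has'), (arg,)).
        if isinstance(key, tuple):
            head, args = key
            inner = ", ".join(str(a) for a in args)
            return Sel(f"{self.css}:{head.stop}({inner})")
        if isinstance(key, slice):
            # sel[::'first-line'] arrives as slice(None, None, 'first-line').
            if key.step is not None:
                return Sel(f"{self.css}::{key.step}")
            # sel[:'invalid'] arrives as slice(None, 'invalid', None).
            return Sel(f"{self.css}:{key.stop}")
        # Anything else is treated here as the body of an attribute selector.
        return Sel(f"{self.css}[{key}]")

    def __str__(self) -> str:
        return self.css


result = Sel("foo")['baz^="qux"'][:'has', (Sel(":invalid"),)][::'first-line']
print(result)  # foo[baz^="qux"]:has(:invalid)::first-line
```

The comma in the functional form is what turns the subscript into a tuple, which is why the name and the arguments cannot simply be juxtaposed.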
Here's what the hierarchy looks like (/ signifies an alias, () mixin superclasses):
    Selector(ABC)  # Enum too?
    ├── PseudoElement
    ├── ComplexSelector(Sequence[CompoundSelector | Combinator])
    ├── CompoundSelector(Sequence[SimpleSelector])
    ├── SimpleSelector
    │   ├── TypeSelector / Tag
    │   ├── UniversalSelector
    │   ├── AttributeSelector / Attribute
    │   ├── ClassSelector / Class
    │   ├── IDSelector / ID
    │   └── PseudoClass
    ├── SELF / PseudoClass('scope')
    └── ALL / UniversalSelector()

    Combinator
    ├── ChildCombinator: '__gt__' / '>'
    ├── DescendantCombinator: '__rshift__' / '>>'
    ├── NamespaceSeparator: '__or__' / '|'
    ├── NextSiblingCombinator: '__add__' / '+'
    ├── SubsequentSiblingsCombinator: '__sub__' / '-'
    └── ColumnCombinator: '__floordiv__' / '//'
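To make the operator-to-combinator mapping concrete, here is a rough sketch of how the dunder methods listed for Combinator could be wired up. The Node class is hypothetical and simply concatenates CSS text; the real design would presumably build ComplexSelector sequences instead.

```python
from __future__ import annotations


class Node:
    """Hypothetical selector node that just accumulates CSS text."""

    def __init__(self, css: str) -> None:
        self.css = css

    def _join(self, combinator: str, other: Node) -> Node:
        return Node(f"{self.css}{combinator}{other.css}")

    def __gt__(self, other: Node) -> Node:        # a > b  -> 'a > b'
        return self._join(" > ", other)

    def __rshift__(self, other: Node) -> Node:    # a >> b -> 'a b'
        return self._join(" ", other)

    def __or__(self, other: Node) -> Node:        # ns | a -> 'ns|a'
        return self._join("|", other)

    def __add__(self, other: Node) -> Node:       # a + b  -> 'a + b'
        return self._join(" + ", other)

    def __sub__(self, other: Node) -> Node:       # a - b  -> 'a ~ b'
        return self._join(" ~ ", other)

    def __floordiv__(self, other: Node) -> Node:  # a // b -> 'a || b'
        return self._join(" || ", other)

    def __str__(self) -> str:
        return self.css


print(Node("ul") >> Node("li") + Node("li"))   # ul li + li
print(Node("svg") | Node("a") > Node("text"))  # svg|a > text
print(Node("col") // Node("td") - Node("td"))  # col || td ~ td
```

One caveat with '>': Python chains comparison operators, so a > b > c is evaluated as (a > b) and (b > c) rather than as two nested calls. With the sketch above that expression would return only the 'b > c' part, so chained child combinators would need explicit parentheses.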
This design has some disadvantages:
The replacements of combinators:
- Descendant combinator (' ') → right shift (>>)
- Column combinator (||) → floor division (//)
- Subsequent-siblings combinator (~) → minus/subtract (-)
>> and // are currently not valid CSS combinators, but may become so in the future. The last replacement (-) is much safer, since - is already considered a valid character for <ident-token>s.
Functional pseudo-classes need a comma between their name (a string/non-callable) and their arguments (a tuple):
[:'where', (Class('foo'), Class('bar'))]
Those disadvantages may need to be taken into account when reworking the design around the following limitations:
- HTML classes with hyphens cannot be added with Python dotted attribute syntax (.foo-bar); not to mention, this also means that any classes that implement this syntax using __getattr__/__getattribute__ won't be able to have any methods.
- Currently there is no way to add an ID in the middle of a compound selector. Since Python doesn't have a # operator, I'm at a loss. I have thought about overloading __call__, but Tag('foo').bar('baz') or Tag('foo')[Attribute('qux')]('baz') would look too much like a normal method call.
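The first limitation can be reproduced in a few lines. The sketch below uses a hypothetical Compound class with __getattr__ (not the library's real code) to show where the dotted syntax breaks down: tag.foo-bar would parse as the subtraction tag.foo - bar, and any method defined on the class takes precedence over a CSS class of the same name, because __getattr__ only runs when normal attribute lookup fails.

```python
from __future__ import annotations


class Compound:
    """Hypothetical compound selector that adds a class on attribute access."""

    def __init__(self, css: str) -> None:
        self.css = css

    def __getattr__(self, name: str) -> Compound:
        # Only reached when normal attribute lookup fails.
        if name.startswith("__"):
            raise AttributeError(name)
        return Compound(f"{self.css}.{name}")

    def render(self) -> str:
        # A real method: it shadows any CSS class that is also named 'render'.
        return self.css


tag = Compound("foo")
print(tag.bar.render())             # foo.bar
# Hyphenated names are reachable only through getattr(), not dotted syntax.
print(getattr(tag, "foo-bar").css)  # foo.foo-bar
print(callable(tag.render))         # True: the method, not a class 'render'
```

Under these assumptions the conflict looks like a name clash between methods and class names rather than a complete ban on methods, although overriding __getattribute__ instead would intercept every lookup.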
How should I go about working around these limitations?