
Class UaProps

                   object --+    
                            |    
post_walk_rules.PostWalkRules --+
                                |
                               UaProps

This class tries to extract properties from the User-Agent string itself. This step is completely separate from the main JSON tree walk, but it uses the results of the tree walk to optimise the property extraction. The property extraction is done in two steps.

Step 1: Try and identify the type of User-Agent and thus the set of property extraction rules to run. This is optimised by the properties from the tree walk.

Step 2: Run the rules found in step 1 to try and extract the properties.
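
The two steps can be illustrated with a minimal sketch. The rule-group layout, the property names and the helper functions below are hypothetical and chosen only for illustration; they are not the data format or the methods of this class.

```python
import re

# Hypothetical rule groups: each pairs the tree-walk properties it requires
# with the extraction regexes to run when those properties match.
RULE_GROUPS = [
    {
        "requires": {"osName": "Android"},
        "rules": {"osVersion": re.compile(r"Android (\d+(?:\.\d+)*)")},
    },
    {
        "requires": {"osName": "iOS"},
        "rules": {"osVersion": re.compile(r"OS (\d+(?:_\d+)*)")},
    },
]


def select_rules(tree_props, rule_groups):
    """Step 1: pick the rules of the first group whose requirements all match."""
    for group in rule_groups:
        if all(tree_props.get(k) == v for k, v in group["requires"].items()):
            return group["rules"]
    return None


def run_rules(user_agent, rules):
    """Step 2: run each extraction regex and collect what it captures."""
    found = {}
    for name, regex in rules.items():
        match = regex.search(user_agent)
        if match:
            found[name] = match.group(1)
    return found


ua = "Mozilla/5.0 (Linux; Android 4.4.2; Nexus 5 Build/KOT49H)"
rules = select_rules({"osName": "Android"}, RULE_GROUPS)
extracted = run_rules(ua, rules) if rules else {}
```

Because the tree walk has already identified the operating system in this sketch, only one small set of regexes is run against the User-Agent.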

Nested Classes

Inherited from post_walk_rules.PostWalkRules: PostWalkRulesException

Instance Methods
 
__init__(self, tree_provider)
x.__init__(...) initializes x; see help(type(x)) for signature
 
put_properties(self, user_agent, props_to_vals, properties)
 
_init_get_matcher_propery_ids(self, group, prop_ids)
Find all the properties that are used for matching.
 
_init_rule_sets(self, group)
Prepare the rule set by extracting it from the current group and counting the items in the group.
 
__init_process_regexes(self)
Process the regexes by overriding any default ones with API specific regexes and then compile the list of regexes.
 
__extract_properties(self, rules_to_run, user_agent, regexes, sought, properties)
This function loops over all the rules in rules_to_run and returns any properties that match.
 
__skip_ua_rules(self, id_properties)
Check the list of properties that cause the rules to be skipped. These are typically non-mobile boolean properties such as isBrowser, isBot, isCrawler, etc.
 
__ua_property_rules(self, user_agent, id_properties, rule_groups, regexes)
Try and find a set of property extraction rules to run on the User-Agent.
 
__find_rules_by_properties(self, groups, user_agent, id_properties, regexes)
Try and find User-Agent type and thus the rules to run by using the properties returned from the tree walk.
 
__check_properties_match(self, prop_list, props_to_values)
This function checks all the properties in the property matcher branch of this rule group.
 
__find_rules_by_regex(self, groups, user_agent, regexes)
Search for the rules to run by checking the User-Agent with a regex.
 
__find_rules_to_run_by_regex(self, user_agent, rule_set, rule_set_count, regexes, type)
Loop over a set of refining rules to try and determine the User-Agent type and so find the rules to run on it.

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Class Variables
  API_ID = 5
  KEY_UA_RULES = u'uar'
  KEY_SKIP_IDS = u'sk'
  KEY_DEFAULT_REGEX_SET = u'd'
  KEY_RULE_GROUPS = 'rg'
  KEY_RULE_REGEX_ID = u'r'
  KEY_REGEXES = u'reg'
  KEY_REGEX_MATCH_POS = u'm'
  KEY_REFINE_REGEX_ID = u'f'
  KEY_SEARCH_REGEX_ID = u's'

Inherited from post_walk_rules.PostWalkRules: KEY_MATCHER_PROP_IDS_IN_USE, KEY_OPERATOR, KEY_PROPERTY_MATCHER, KEY_PROPERTY_VALUE, KEY_RULE_ARR, KEY_RULE_PROP_IDS_IN_USE, KEY_RULE_SET, KEY_RULE_SET_COUNT, branch, tree_provider

Properties

Inherited from object: __class__

Method Details

__init__(self, tree_provider)
(Constructor)

 

x.__init__(...) initializes x; see help(type(x)) for signature

Overrides: object.__init__
(inherited documentation)

_init_get_matcher_propery_ids(self, group, prop_ids)

 

Find all the properties that are used for matching.

Parameters:
  • group - The rule group that can contain a property matcher
  • prop_ids - The list of found property IDs
Returns:
an updated set of property IDs.
Overrides: post_walk_rules.PostWalkRules._init_get_matcher_propery_ids

_init_rule_sets(self, group)

 

Prepare the rule set by extracting it from the current group and counting the items in the group. This is done to avoid counting the items on every request.

Parameters:
  • group - The current parent group.
Returns:
a list of all rule sets.
Overrides: post_walk_rules.PostWalkRules._init_rule_sets

__init_process_regexes(self)

 

Process the regexes by overriding any default ones with API specific regexes and then compile the list of regexes. This also changes the regex key from a string to an integer for easier retrieval later on.
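
A minimal sketch of this kind of merge-and-compile step follows. It assumes a regex branch that holds a default set under 'd' (the value of KEY_DEFAULT_REGEX_SET) and optional overrides under the API id as a string; that layout and the function name are assumptions, not the documented data format.

```python
import re


def process_regexes(regex_branch, api_id):
    """Merge default and API-specific patterns, then compile them.

    Assumed layout: {"d": {"0": pattern, ...}, "<api id>": {"0": pattern, ...}}
    """
    merged = dict(regex_branch.get("d", {}))
    # API-specific entries replace defaults that share the same id.
    merged.update(regex_branch.get(str(api_id), {}))
    # Re-key by int so later lookups can use the numeric regex id directly.
    return {int(key): re.compile(pattern) for key, pattern in merged.items()}


branch = {
    "d": {"0": r"Android (\d+)", "1": r"Chrome/(\d+)"},
    "5": {"1": r"Chrome/(\d+\.\d+)"},
}
compiled = process_regexes(branch, api_id=5)
```

Compiling once at initialisation means each request only pays for the match, not for parsing the pattern.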

__extract_properties(self, rules_to_run, user_agent, regexes, sought, properties)

 

This function loops over all the rules in rules_to_run and returns any properties that match. The properties returned can be typed or strings.

Parameters:
  • rules_to_run - The rules to run against the User-Agent to find the properties.
  • user_agent - The User-Agent to find properties for.
  • regexes - The list of compiled regexes.
  • sought - A set of properties to return values for.
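
As an illustration only, the loop could look like the sketch below. The rule fields ("property", "regex_id", "group", "type") are hypothetical names, and treating a sought value of None as "return everything" is also an assumption.

```python
import re


def extract_properties(rules_to_run, user_agent, regexes, sought):
    """Return {property: value} for every sought rule whose regex matches."""
    found = {}
    for rule in rules_to_run:
        name = rule["property"]
        if sought is not None and name not in sought:
            continue  # the caller did not ask for this property
        match = regexes[rule["regex_id"]].search(user_agent)
        if match is None:
            continue
        raw = match.group(rule.get("group", 1))
        # Values may be typed: convert when the rule names a converter.
        found[name] = rule.get("type", str)(raw)
    return found


regexes = {0: re.compile(r"Chrome/(\d+)"), 1: re.compile(r"Android ([\d.]+)")}
rules = [
    {"property": "browserMajor", "regex_id": 0, "type": int},
    {"property": "osVersion", "regex_id": 1},
]
ua = "Mozilla/5.0 (Linux; Android 4.4.2) Chrome/34.0.1847.114"
only_browser = extract_properties(rules, ua, regexes, {"browserMajor"})
everything = extract_properties(rules, ua, regexes, None)
```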

__skip_ua_rules(self, id_properties)

 

Check the list of properties that cause the rules to be skipped. These are typically non-mobile boolean properties such as isBrowser, isBot, isCrawler, etc.

Parameters:
  • id_properties - The results of the tree walk, map of property id to value id
Returns:
True if the UA rules are to be skipped, False if they have to be run
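
A minimal sketch of such a check, assuming the skip list holds property ids and that value ids index into a shared value array. The parameter names skip_ids and values, and the rule that a single true skip property is enough, are assumptions made for illustration.

```python
def skip_ua_rules(id_properties, skip_ids, values):
    """Return True when any skip property resolves to the boolean True."""
    for prop_id in skip_ids:
        value_id = id_properties.get(prop_id)
        if value_id is not None and values[value_id] is True:
            return True
    return False


values = ["Android", True, False]  # hypothetical value array
skip_ids = [7, 9]                  # hypothetical ids, e.g. isBrowser and isBot

is_desktop_browser = skip_ua_rules({7: 1}, skip_ids, values)
is_mobile_device = skip_ua_rules({7: 2, 3: 0}, skip_ids, values)
```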

__ua_property_rules(self, user_agent, id_properties, rule_groups, regexes)

 

Try and find a set of property extraction rules to run on the User-Agent. This is done in two ways.

The first way uses properties found from the tree walk to identify the User-Agent type. If there are still multiple UA types then refining regexes can be run.

If the above approach fails to find a match, fall back to the second way, which uses a more brute-force regex search.

Once the UA type is known the correct set of property extraction rules can be returned.

Parameters:
  • user_agent - The User-Agent to find properties for.
  • id_properties - The results of the tree walk, map of property id to value id.
  • rule_groups - All the rule groups that contain the matchers and the rules to run.
  • regexes - The list of compiled regexes.
Returns:
a map of rules to run against the User-Agent or None if no rules are found.
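
The fallback order described above can be sketched as follows. The two finder callables are simplified stand-ins for the property-based and regex-based lookups and do not mirror their real signatures.

```python
def ua_property_rules(user_agent, id_properties, by_properties, by_regex):
    """Pick extraction rules: property lookup first, regex search second."""
    rules = by_properties(user_agent, id_properties)
    if rules is None:
        # The tree-walk properties did not identify the User-Agent type.
        rules = by_regex(user_agent)
    return rules


# Stand-in finders, for illustration only.
def by_properties(user_agent, id_properties):
    return {"rules": "android"} if id_properties.get(1) == 10 else None


def by_regex(user_agent):
    return {"rules": "generic"} if "Mozilla" in user_agent else None


first = ua_property_rules("Mozilla/5.0", {1: 10}, by_properties, by_regex)
second = ua_property_rules("Mozilla/5.0", {}, by_properties, by_regex)
third = ua_property_rules("curl/7.0", {}, by_properties, by_regex)
```

In this sketch the regex search only runs when the cheaper property lookup returns None.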

__find_rules_by_properties(self, groups, user_agent, id_properties, regexes)

 

Try and find User-Agent type and thus the rules to run by using the properties returned from the tree walk. All the properties defined in the property matcher set must match. If a match is found then the rules can be returned.

Parameters:
  • groups - The rule groups to loop over.
  • user_agent - The User-Agent to find properties for.
  • id_properties - The results of the tree walk, map of property id to value id.
  • regexes - The list of compiled regexes.
Returns:
a dict of rules to run against the User-Agent or None if no rules are found.

__check_properties_match(self, prop_list, props_to_values)

 

This function checks all the properties in the property matcher branch of this rule group. This branch contains a list of properties and their values. All must match for this function to return True.

In reality the properties and values are indexes to the main property and value arrays.

Parameters:
  • prop_list - The list of properties to check for matches.
  • props_to_values - Dict of property and value ids
Returns:
True if all properties match, False otherwise.
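
A minimal sketch of an all-must-match check over id pairs. The matcher keys "p" and "v" are hypothetical, and only equality is compared here; any operator handling in the real matcher is not modelled.

```python
def check_properties_match(prop_list, props_to_values):
    """Return True only if every matcher's property id maps to its value id."""
    return all(
        props_to_values.get(matcher["p"]) == matcher["v"]
        for matcher in prop_list
    )


# Property and value ids are indexes, not names or literal values.
matchers = [{"p": 3, "v": 12}, {"p": 8, "v": 40}]

full_match = check_properties_match(matchers, {3: 12, 8: 40, 9: 1})
partial_match = check_properties_match(matchers, {3: 12, 8: 41})
```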

__find_rules_by_regex(self, groups, user_agent, regexes)

 

Search for the rules to run by checking the User-Agent with a regex. If there is a match the rule list is returned.

Parameters:
  • groups - The rule groups to loop over.
  • user_agent - The User-Agent to find properties for.
  • regexes - The list of compiled regexes.
Returns:
a dict of rules to run against the User-Agent or None if no rules are found.

__find_rules_to_run_by_regex(self, user_agent, rule_set, rule_set_count, regexes, type)

 

Loop over a set of refining rules to try and determine the User-Agent type and so find the rules to run on it.

Parameters:
  • user_agent - The User-Agent to find properties for.
  • rule_set - The rule_set that contains the search regex id, the refine regex id and the rules_to_run.
  • rule_set_count - The pre-counted items in rule_set.
  • regexes - The list of compiled regexes.
  • type - The type of rule to run either Refine or Search.
Returns:
a dict of rules to run against the User-Agent or None if no rules are found.
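
As a sketch under stated assumptions, the loop might look like this. The 'f' and 's' keys follow KEY_REFINE_REGEX_ID and KEY_SEARCH_REGEX_ID from the class variables, while the "rules_to_run" key and the first-match-wins behaviour are assumptions.

```python
import re


def find_rules_to_run_by_regex(user_agent, rule_set, rule_set_count,
                               regexes, rule_type):
    """Return the rules of the first entry whose regex matches the UA.

    rule_type is 'f' (refine) or 's' (search).
    """
    for i in range(rule_set_count):  # count is pre-computed by the caller
        entry = rule_set[i]
        regex_id = entry.get(rule_type)
        if regex_id is None:
            continue  # this entry has no regex of the requested type
        if regexes[regex_id].search(user_agent):
            return entry["rules_to_run"]
    return None


regexes = {0: re.compile(r"Android"), 1: re.compile(r"iPhone")}
rule_set = [
    {"s": 0, "rules_to_run": ["android-rules"]},
    {"s": 1, "f": 1, "rules_to_run": ["iphone-rules"]},
]
ua = "Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X)"
by_search = find_rules_to_run_by_regex(ua, rule_set, len(rule_set), regexes, "s")
by_refine = find_rules_to_run_by_regex(ua, rule_set, len(rule_set), regexes, "f")
```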