
Adblock Plus rule management class FilterManager; Adblock Plus rule implementation

January 28th, 2019

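Adblock Plus documentation:
http://adblockplus.org/en/documentation
It covers a lot of material. In particular,
http://adblockplus.org/en/faq_internal#filters
explains how to search the rules efficiently. I followed that approach and implemented a HashMap to manage the rules.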
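As a rough illustration of the shortcut idea from that FAQ entry, here is a minimal sketch that uses the C++ standard library instead of WebKit's HashMap and String. The names findShortcut and ShortcutIndex are hypothetical, the lookup is case-sensitive, and treating '*', '|' and '^' as the characters that break a shortcut is an assumption of this sketch, not something taken from FilterManager.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Shortcut length used by this sketch; the FAQ says eight is an arbitrary choice.
const std::size_t kShortcutLength = 8;

// Assumption: these characters have a special meaning and cannot be part of a shortcut.
static bool breaksShortcut(char c) {
    return c == '*' || c == '|' || c == '^';
}

// Returns the first run of kShortcutLength plain characters, or "" if there is none.
std::string findShortcut(const std::string& filter) {
    std::size_t runStart = 0;
    for (std::size_t i = 0; i <= filter.size(); ++i) {
        if (i == filter.size() || breaksShortcut(filter[i])) {
            if (i - runStart >= kShortcutLength)
                return filter.substr(runStart, kShortcutLength);
            runStart = i + 1;
        }
    }
    return std::string();
}

class ShortcutIndex {
public:
    // Stores the filter under its shortcut. Returns false if no shortcut exists.
    bool add(const std::string& filter) {
        std::string key = findShortcut(filter);
        if (key.empty())
            return false;
        m_table[key].push_back(filter);
        return true;
    }

    // Collects the filters whose shortcut occurs somewhere in the URL.
    // These are only candidates: each still needs a full pattern match, and a
    // filter can appear more than once if its shortcut occurs repeatedly.
    std::vector<std::string> candidates(const std::string& url) const {
        std::vector<std::string> result;
        for (std::size_t i = 0; i + kShortcutLength <= url.size(); ++i) {
            auto it = m_table.find(url.substr(i, kShortcutLength));
            if (it != m_table.end())
                result.insert(result.end(), it->second.begin(), it->second.end());
        }
        return result;
    }

private:
    std::unordered_map<std::string, std::vector<std::string>> m_table;
};
```

Filters for which add returns false have no shortcut. They would have to be kept in a separate list and tested one after another, which is the slow path the FAQ warns about.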

The previous post covered how Adblock Plus rules are matched, but some rules need extra information and cannot be handled with the URL string alone, for example domain information for rules such as third-party.

For now I only plan to support the script, image, stylesheet, third-party and domain rules. The script, image and stylesheet rules are matched by the file extension in the URL string, while third-party and domain are analyzed with the help of the KURL class.

So the new FilterRule interface is designed as follows:

#ifndef FILTER_H
#define FILTER_H

#include "PlatformString.h"
#include "FilterManager.h"
#include <wtf/Vector.h>
#include <wtf/HashMap.h>
#include "KURL.h"

namespace WebCore {

    class FilterRule {
    public:
        /*
         The manager has already determined that this is a filter rule and not a hiding rule.
         A rule starting with @@ is a whitelist rule; the manager gives it priority.
         A rule starting with || does not match against the protocol name; the || is removed.
         If the rule starts with |, the | is removed; otherwise * is prepended.
         If the rule contains $ type options, that part is stripped and the types are processed.
         If the rule ends with |, the | is removed; otherwise * is appended.
         */
        FilterRule(const String & rule);

        /*
         Whether the request should be filtered. For a whitelist rule a match means
         the request should not be filtered; otherwise it is filtered.
         The type is used to filter only the element types named in the Adblock Plus rule.
         */
        bool shouldFilter(const KURL & mainURL, const KURL & url, FilterType t);

        // Whether this is a whitelist rule.
        bool isWhiteFilter() { return m_isException; }

        // Whether filtering depends on the type, e.g. filtering only scripts.
        // Some options may need a lot of extra information and are not considered
        // for now, e.g. domain-type filtering.
        bool isNeedMimeType() { return m_type != 0; }

        const String & getRegularFilter() { return m_reFilter; }
        const String & getWholeRule() { return m_rule; }
        //inline const StringVector & constantsForFastSearch() { return constants; }
        void print();

    private:
        bool m_isException; // starts with @@ (whitelist)
        bool m_isMatchProtocol;

        /*
         The Adblock rule expressed as a regular expression.
         */
        String m_reFilter;
        //StringVector constants;
        String m_rule;

        /*
         Type options: determine which types of elements a filter can block (or
         whitelist in case of an exception rule). Multiple type options can be
         specified to indicate that the filter should be applied to several types
         of elements. Possible types are:
         */
        FilterType m_type;

        /*
         Restriction to third-party/first-party requests: If the third-party option
         is specified, the filter is only applied to requests from a different
         origin than the currently viewed page. Similarly, ~third-party restricts
         the filter to requests from the same origin as the currently viewed page.
         */
        bool m_filterThirdParty;
        bool m_matchFirstParty;

        /*
         Domain restrictions: The option domain=example.com means that the filter
         should only be applied on pages from "example.com" domain. Multiple domains
         can be specified using "|" as separator: with the option
         domain=example.com|example.net the filter will only be applied on pages
         from "example.com" or "example.net" domains. If a domain name is preceded
         with "~", the filter should not be applied on pages from this domain. For
         example, domain=~example.com means that the filter should be applied on
         pages from any domain but "example.com" and
         domain=example.com|~foo.example.com restricts the filter to the
         "example.com" domain with the exception of "foo.example.com" subdomain.
         */
        Vector<String> m_domains;
        Vector<String> m_inverseDomains;

    private:
        bool isMatchType(const KURL & url, FilterType t);
        bool isMatchThirdParty(const KURL & host, const KURL & other);
        bool isMatchDomains(const KURL & url);
        bool processDomains(String & ds);
    };

    // Hiding rule: a rule that contains ##.
    class HideRule {
    public:
        /*
         The string before ## is parsed as a list of domain names. The part after
         it is kept unchanged and handled as a CSS selector.
         */
        HideRule(const String & r);

        // Domains the hiding rule applies to. If empty, the rule applies to all
        // domains; otherwise only to the listed ones.
        const StringVector & domains();

        // example.com,~foo.example.com##*.sponsor
        // *.sponsor is the selector
        const String & selector();

        void print();

    private:
        String m_sel;
        StringVector m_domains;
    };

}

#endif // FILTER_H
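The normalization steps listed in the comment above the FilterRule constructor can be sketched as follows. This is only one possible reading of that comment, written against std::string instead of WebKit's String. ParsedRule and parseFilterRule are hypothetical names, splitting the options at the last '$' is an assumption, and the sketch skips adding '*' where one is already present.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct ParsedRule {
    bool isException = false;          // rule started with "@@"
    bool matchProtocol = true;         // false when the rule started with "||"
    std::string pattern;               // wildcard pattern after normalization
    std::vector<std::string> options;  // comma-separated items after '$'
};

ParsedRule parseFilterRule(std::string rule) {
    ParsedRule out;

    // "@@" marks a whitelist (exception) rule.
    if (rule.compare(0, 2, "@@") == 0) {
        out.isException = true;
        rule.erase(0, 2);
    }

    // "||" means the protocol is not matched; the marker is removed.
    if (rule.compare(0, 2, "||") == 0) {
        out.matchProtocol = false;
        rule.erase(0, 2);
    }

    // A leading '|' anchors the start; without it a '*' is prepended.
    if (!rule.empty() && rule[0] == '|')
        rule.erase(0, 1);
    else if (rule.empty() || rule[0] != '*')
        rule.insert(0, "*");

    // Strip the "$option,option" suffix and keep the option names.
    std::size_t dollar = rule.rfind('$');
    if (dollar != std::string::npos) {
        std::string opts = rule.substr(dollar + 1);
        rule.erase(dollar);
        std::size_t start = 0;
        while (start <= opts.size()) {
            std::size_t comma = opts.find(',', start);
            if (comma == std::string::npos)
                comma = opts.size();
            if (comma > start)
                out.options.push_back(opts.substr(start, comma - start));
            start = comma + 1;
        }
    }

    // A trailing '|' anchors the end; without it a '*' is appended.
    if (!rule.empty() && rule.back() == '|')
        rule.pop_back();
    else if (rule.empty() || rule.back() != '*')
        rule.push_back('*');

    out.pattern = rule;
    return out;
}
```

The resulting pattern contains only literal text and '*' wildcards, so it can then be turned into a regular expression or matched with a simple wildcard matcher.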

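The domain option described in the FilterRule comments splits into two lists, which is what m_domains and m_inverseDomains suggest. A minimal sketch of that split and of the check against the page host is shown below. It works on plain host strings instead of KURL, DomainOption, parseDomainOption, hostIsInDomain and domainOptionApplies are hypothetical names, hosts are assumed to be lowercase already, and letting an excluded domain always win over an included one is a simplification.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct DomainOption {
    std::vector<std::string> domains;         // e.g. "example.com"
    std::vector<std::string> inverseDomains;  // written as "~foo.example.com" in the rule
};

// Parses the value of domain=..., e.g. "example.com|~foo.example.com".
DomainOption parseDomainOption(const std::string& value) {
    DomainOption out;
    std::size_t start = 0;
    while (start <= value.size()) {
        std::size_t bar = value.find('|', start);
        if (bar == std::string::npos)
            bar = value.size();
        std::string item = value.substr(start, bar - start);
        if (item.size() > 1 && item[0] == '~')
            out.inverseDomains.push_back(item.substr(1));
        else if (!item.empty() && item[0] != '~')
            out.domains.push_back(item);
        start = bar + 1;
    }
    return out;
}

// True if host equals domain or is a subdomain of it.
bool hostIsInDomain(const std::string& host, const std::string& domain) {
    if (host == domain)
        return true;
    return host.size() > domain.size()
        && host.compare(host.size() - domain.size(), domain.size(), domain) == 0
        && host[host.size() - domain.size() - 1] == '.';
}

// True if a rule carrying this option should be applied on the given page host.
bool domainOptionApplies(const DomainOption& opt, const std::string& pageHost) {
    for (const std::string& d : opt.inverseDomains)
        if (hostIsInDomain(pageHost, d))
            return false;
    if (opt.domains.empty())
        return true;  // only exclusions were given
    for (const std::string& d : opt.domains)
        if (hostIsInDomain(pageHost, d))
            return true;
    return false;
}
```

With domain=example.com|~foo.example.com this applies the rule on example.com and www.example.com but not on foo.example.com, which matches the example in the comment.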
 
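The HideRule comment says the text before ## is a domain list and the text after it is the CSS selector. A minimal sketch of that split, again on std::string, could look like this. ParsedHideRule and parseHideRule are hypothetical names, and the domain entries are returned as written, including any leading "~".

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct ParsedHideRule {
    bool valid = false;                // false when the rule contains no "##"
    std::vector<std::string> domains;  // entries before "##", split at ','
    std::string selector;              // everything after "##", unchanged
};

ParsedHideRule parseHideRule(const std::string& rule) {
    ParsedHideRule out;
    std::size_t sep = rule.find("##");
    if (sep == std::string::npos)
        return out;  // not a hiding rule

    out.valid = true;
    out.selector = rule.substr(sep + 2);

    // Split the part before "##" into domain entries, skipping empty ones.
    std::string list = rule.substr(0, sep);
    std::size_t start = 0;
    while (start <= list.size()) {
        std::size_t comma = list.find(',', start);
        if (comma == std::string::npos)
            comma = list.size();
        if (comma > start)
            out.domains.push_back(list.substr(start, comma - start));
        start = comma + 1;
    }
    return out;
}
```

An empty domain list then means the selector applies everywhere, as the comment on domains() states.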

#ifndef FILTERMANAGER_H
#define FILTERMANAGER_H

#include "PlatformString.h"
#include <wtf/Vector.h>
#include "StringHash.h"
#include <wtf/HashMap.h>
#include <wtf/HashSet.h>
#include "KURL.h"

//#define ADB_NO_QT_DEBUG

namespace WebCore {

    /*
     Match types. For now only script, image, stylesheet and third_party are supported.
     */
    #define FILTER_TYPE_SCRIPT 0x0001
    #define FILTER_TYPE_IMAGE 0x0002
    #define FILTER_TYPE_BACKGROUND 0x0004
    #define FILTER_TYPE_STYLESHEET 0x0008
    #define FILTER_TYPE_OBJECT 0x0010
    #define FILTER_TYPE_XBL 0x0020 // will not be supported
    #define FILTER_TYPE_PING 0x0040
    #define FILTER_TYPE_XMLHTTPREQUEST 0x0080
    #define FILTER_TYPE_OBJECT_SUBREQUEST 0x0100
    #define FILTER_TYPE_DTD 0x0200
    #define FILTER_TYPE_SUBDOCUMENT 0x0400
    #define FILTER_TYPE_DOCUMENT 0x0800
    #define FILTER_TYPE_ELEMHIDE 0x1000
    #define FILTER_TYPE_THIRD_PARTY 0x2000
//  #define FILTER_TYPE_DOMAIN 0x4000
//  #define FILTER_TYPE_MATCH_CASE 0x8000
//  #define FILTER_TYPE_COLLAPSE 0x10000

    typedef unsigned int FilterType;
    typedef Vector<String> StringVector;

    class FilterRule;
    class HideRule;
    class FilterRuleList;
    class HideRuleList;

    // There should be only one instance.
    /*
     Thread safety of this class needs to be considered. Normal lookups can be
     made safe, but how to guarantee thread safety while rules are being removed
     or added dynamically is still open. Internally a map or a hash is used to
     manage the rules.
     */
    class FilterManager {
        //typedef HashMap<String, FilterRuleList*, CaseFoldingHash> FilterRuleMap;
        typedef HashMap<String, HideRuleList*, CaseFoldingHash> HideRuleMap;
        typedef Vector<FilterRule*> FilterRuleVector;

        class FilterRuleMap : public HashMap<String, FilterRuleList*, CaseFoldingHash> {
            HashSet<unsigned int> unMatchRules;
        public:
            ~FilterRuleMap();
            // prepare to start find
            inline void prepareStartFind() { this->unMatchRules.clear(); }
            // release resource
            //inline void endFind() {}
            bool doFilter(const KURL & mainURL, const String & key, const KURL & url, FilterType t);
        };

    private:
        HideRuleMap hiderules;
        FilterRuleMap m_ShortcutWhiteRules; // white list, can use shortcut
        FilterRuleVector m_UnshortcutWhiteRules;
        FilterRuleMap m_ShortcutFilterRules;
        FilterRuleVector m_UnshortcutFilterRules;
        FilterRuleVector m_AllFilterRules;
        Vector<HideRule*> m_AllHideRules;

    private:
        /*
         Reads the rules from a file. It would be nice if String had implicit
         sharing like Qt's; WebKit's String is in fact implicitly shared, so it
         can be passed by value directly.
         */
        FilterManager(const String & filename);
        // A collection of rules.
        FilterManager(const StringVector & rules);

    public:
        static FilterManager* getManager(const String & filename);
        static FilterManager* getManager(const StringVector & rules);
        ~FilterManager();

        bool addRule(String rule);
        // Which rule; at run time a rule cannot be hidden, only deleted.
        bool hideRule(int id);

        /*
         Whether the request should be filtered.
         Type matching is not considered for now because the type information is
         not available. Many types cannot be determined reliably, e.g. background
         has to be a request coming from CSS, which cannot be known at the moment.
         */
        /*
         * Besides of translating filters into regular expressions Adblock Plus also
         tries to extract text information from them. What it needs is a unique
         string of eight characters (a "shortcut") that must be present in every
         address matched by the filter (the length is arbitrary, eight just seems
         reasonable here). For example, if you have a filter |http://ad.* then
         Adblock Plus has the choice between "http://a", "ttp://ad" and "tp://ad.",
         any of these strings will always be present in whatever this filter will
         match. Unfortunately finding a shortcut for filters that simply don't have
         eight characters unbroken by wildcards or for filters that have been
         specified as regular expressions is impossible.

         All shortcuts are put into a lookup table, Adblock Plus can find the filter
         by its shortcut very efficiently. Then, when a specific address has to be
         tested Adblock Plus will first look for known shortcuts there (this can be
         done very fast, the time needed is almost independent from the number of
         shortcuts). Only when a shortcut is found the string will be tested against
         the regular expression of the corresponding filter. However, filters
         without a shortcut still have to be tested one after another which is slow.

         To sum up: which filters should be used to make a filter list fast? You
         should use as few regular expressions as possible, those are always slow.
         You also should make sure that simple filters have at least eight
         characters of unbroken text (meaning that these don't contain any
         characters with a special meaning like *), otherwise they will be just as
         slow as regular expressions. But with filters that qualify it doesn't
         matter how many filters you have, the processing time is always the same.
         That means that if you need 20 simple filters to replace one regular
         expression then it is still worth it. Speaking of which, the deregifier is
         very recommendable.
         */
        bool shouldFilter(const KURL & mainURL, const KURL & url, FilterType t = 0);

        // Use WebKit's internal pointer management to manage the return value?
        // Determines the applicable CSS rules from the domain name. Unsupported
        // CSS rules are ignored for now.
        String cssrules(const String & domain);

    private:
        void addRule(FilterRule * r);
        void addRule(HideRule * r);
    };

}

#endif // FILTERMANAGER_H
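Since the FILTER_TYPE values are single bits, a rule can store several types in one FilterType mask, and the script, image and stylesheet types are guessed from the file extension in the URL. A minimal sketch of both steps follows. The constants mirror the FILTER_TYPE_SCRIPT, FILTER_TYPE_IMAGE and FILTER_TYPE_STYLESHEET values from the header, while guessTypeFromUrl, typeMatches and the list of extensions are assumptions made for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

typedef unsigned int FilterType;

// Same bit values as the corresponding FILTER_TYPE_* macros in the header.
const FilterType kFilterTypeScript = 0x0001;
const FilterType kFilterTypeImage = 0x0002;
const FilterType kFilterTypeStylesheet = 0x0008;

// Guesses the request type from the file extension; returns 0 if unknown.
FilterType guessTypeFromUrl(const std::string& url) {
    // Ignore the query string and the fragment.
    std::string path = url.substr(0, url.find_first_of("?#"));

    std::size_t dot = path.rfind('.');
    std::size_t slash = path.rfind('/');
    if (dot == std::string::npos || (slash != std::string::npos && dot < slash))
        return 0;  // the last path segment has no extension

    std::string ext = path.substr(dot + 1);
    for (char& c : ext)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

    if (ext == "js")
        return kFilterTypeScript;
    if (ext == "css")
        return kFilterTypeStylesheet;
    if (ext == "png" || ext == "gif" || ext == "jpg" || ext == "jpeg")
        return kFilterTypeImage;
    return 0;
}

// A rule without type options (mask 0) matches every request type.
// Otherwise at least one bit of the request type must be set in the rule mask.
bool typeMatches(FilterType ruleTypes, FilterType requestType) {
    if (ruleTypes == 0)
        return true;
    return (ruleTypes & requestType) != 0;
}
```

A rule with the options $script,image would carry the mask kFilterTypeScript | kFilterTypeImage, so a .gif request matches it while a .css request does not.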

  
