@@ -18,114 +18,149 @@ DESCRIPTION
top level <style> declarations.

METHODS
- new
- Instantiates the Inliner object. Sets up class variables that are
- used during file parsing/processing. Possible options are:
+ new
+ Instantiates the Inliner object. Sets up class variables that are used
+ during file parsing/processing. Possible options are:

- entities - (optional) Pass in a string containing characters to
- entity encode in all output, overrides the internal default provided
- by the module
+ html_tree - (optional) Pass in a fresh unparsed instance of
+ HTML::Treebuilder

- html_tree - (optional) Pass in a fresh unparsed instance of
- HTML::Treebuilder
+ NOTE: Any passed references to HTML::TreeBuilder will be substantially
+ altered by passing it in here...

- NOTE: Any passed references to HTML::TreeBuilder will be
- substantially altered by passing it in here...
+ strip_attrs - (optional) Remove all "id" and "class" attributes during
+ inlining

- strip_attrs - (optional) Remove all "id" and "class" attributes
- during inlining
+ leave_style - (optional) Leave style/link tags alone within <head>
+ during inlining

- leave_style - (optional) Leave style/link tags alone within <head>
- during inlining
+ relaxed - (optional) Relaxed HTML parsing which will attempt to
+ interpret non-HTML4 documents.

- relaxed - (optional) Relaxed HTML parsing which will attempt to
- interpret non-HTML4 documents.
+ NOTE: This argument is not compatible with passing an html_tree.

- NOTE: This argument is not compatible with passing an html_tree.
+ agent - (optional) Pass in a string containing a preferred user-agent,
+ overrides the internal default provided by the module for handling
+ remote documents

- agent - (optional) Pass in a string containing a preferred
- user-agent, overrides the internal default provided by the module
- for handling remote documents
+ fetch_file
+ Fetches a remote HTML file that supposedly contains both HTML and a
+ style declaration, properly tags the data with the proper characterset
+ as provided by the remote webserver (if any). Subsequently calls the
+ read method automatically.

- fetch_file
- Fetches a remote HTML file that supposedly contains both HTML and a
- style declaration, properly tags the data with the proper
- characterset as provided by the remote webserver (if any).
- Subsequently calls the read method automatically.
+ This method expands all relative urls, as well as fully expands the
+ stylesheet reference within the document.

- This method expands all relative urls, as well as fully expands the
- stylesheet reference within the document.
+ This method requires you to pass in a params hash that contains a url
+ argument for the requested document. For example:

- This method requires you to pass in a params hash that contains a
- url argument for the requested document. For example:
+ $self->fetch_file({ url => 'http://www.example.com' });

- $self->fetch_file({ url => 'http://www.example.com' });
+ Note that you can specify a user-agent to override the default
+ user-agent of 'Mozilla/4.0' within the constructor. Doing so may avoid
+ certain issues with agent filtering related to quirky webserver configs.

- Note that you can specify a user-agent to override the default
- user-agent of 'Mozilla/4.0' within the constructor. Doing so may
- avoid certain issues with agent filtering related to quirky
- webserver configs.
+ Input Parameters: url - the desired url for a remote asset presumably
+ containing both html and css charset - (optional) programmer specified
+ charset for the pass url

- read_file
- Opens and reads an HTML file that supposedly contains both HTML and
- a style declaration. It subsequently calls the read() method
- automatically.
+ read_file
+ Opens and reads an HTML file that supposedly contains both HTML and a
+ style declaration. It subsequently calls the read() method
+ automatically.

- This method requires you to pass in a params hash that contains a
- filename argument. For example:
+ This method requires you to pass in a params hash that contains a
+ filename argument. For example:

- $self->read_file({ filename => 'myfile.html' });
+ $self->read_file({ filename => 'myfile.html' });

- Additionally you can specify the character encoding within the file,
- for example:
+ Additionally you can specify the character encoding within the file, for
+ example:

- $self->read_file({ filename => 'myfile.html', charset => 'utf8' });
+ $self->read_file({ filename => 'myfile.html', charset => 'utf8' });

- read
- Reads passed html data and parses it. The intermediate data is
- stored in class variables.
+ Input Parameters: filename - name of local file presumably containing
+ both html and css charset - (optional) programmer specified charset of
+ the passed file

- The <style> block is ripped out of the html here, and stored
- separately. Class/ID/Names used in the markup are left alone.
+ read
+ Reads passed html data and parses it. The intermediate data is stored in
+ class variables.

- This method requires you to pass in a params hash that contains
- scalar html data. For example:
+ The <style> block is ripped out of the html here, and stored separately.
+ Class/ID/Names used in the markup are left alone.

- $self->read({ html => $html });
+ This method requires you to pass in a params hash that contains scalar
+ html data. For example:

- NOTE: You are required to pass a properly encoded perl reference to
- the html data. This method does *not* do the dirty work of encoding
- the html as utf8 - do that before calling this method.
+ $self->read({ html => $html });

- inlinify
- Processes the html data that was entered through either 'read' or
- 'read_file', returns a scalar that contains a composite chunk of
- html that has inline styles instead of a top level <style>
- declaration.
+ NOTE: You are required to pass a properly encoded perl reference to the
+ html data. This method does *not* do the dirty work of encoding the html
+ as utf8 - do that before calling this method.

- query
- Given a particular selector return back the applicable styles
+ Input Parameters: html - scalar presumably containing both html and css
+ charset - (optional) scalar representing the original charset of the
+ passed html

- specificity
- Given a particular selector return back the associated selectivity
+ detect_charset
+ Detect the charset of the passed content.

- content_warnings
- Return back any warnings thrown while inlining a given block of
- content.
+ The algorithm present here is roughly based off of the HTML5 W3C working
+ group document, which lays out a recommendation for determining the
+ character set of a received document, which can be seen here under the
+ "determining the character encoding" section:
+ http://www.w3.org/TR/html5/syntax.html

- Note: content warnings are initialized at inlining time, not at read
- time. In order to receive back content feedback you must perform
- inlinify first
+ Input Parameters: content - scalar presumably containing both html and
+ css charset - (optional) programmer specified charset for the passed
+ content ctcharset - (optional) content-type specified charset for
+ content retrieved via a url
+
+ decode_characters
+ Implement the character decoding algorithm for HTML as outlined by the
+ various working groups
+
+ Basically apply best practices for determining the applied character
+ encoding and properly decode it
+
+ It is expected that this method will be called before any calls to
+ read()
+
+ Input Parameters: content - scalar presumably containing both html and
+ css charset - known charset for the passed content
+
+ inlinify
+ Processes the html data that was entered through either 'read' or
+ 'read_file', returns a scalar that contains a composite chunk of html
+ that has inline styles instead of a top level <style> declaration.
+
+ query
+ Given a particular selector return back the applicable styles
+
+ specificity
+ Given a particular selector return back the associated selectivity
+
+ content_warnings
+ Return back any warnings thrown while inlining a given block of content.
+
+ Note: content warnings are initialized at inlining time, not at read
+ time. In order to receive back content feedback you must perform
+ inlinify first

Sponsor
This code has been developed under sponsorship of MailerMailer LLC,
http://www.mailermailer.com/

AUTHOR
-
+

CONTRIBUTORS
-
+
+
+ Michael Peters <[email protected]>
+

LICENSE
This module is Copyright 2015 Khera Communications, Inc. It is licensed