Charset Declaration Test
The Charset Declaration Test verifies that a URL declares its character encoding correctly, both via <meta charset="utf-8"> in the <head> and via a matching Content-Type response header. UTF-8 is the standard encoding for the modern web; anything else risks broken accents, mojibake (garbled characters), and inconsistent behaviour across browsers. Declaring the encoding early in <head> lets the browser decode the rest of the document correctly.
What This Tool Checks
- <meta charset="utf-8"> in <head>
- Content-Type response header with charset
- Charset declared within the first 1024 bytes
- Byte order mark (BOM) presence and impact
- Mismatch between declared and actual encoding
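Two of these checks, the 1024-byte window and the BOM, can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the helper names (find_meta_charset, declared_early, has_utf8_bom) are hypothetical, and the regular expression is a simplification of the full HTML prescan that browsers perform.

```python
import codecs
import re

# Hypothetical helpers for illustration; the regex is a simplified stand-in
# for the browser's real prescan algorithm.
META_CHARSET_RE = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)', re.IGNORECASE
)
PRESCAN_LIMIT = 1024  # the declaration should sit inside this many bytes


def has_utf8_bom(body: bytes) -> bool:
    """True if the body starts with the UTF-8 byte order mark (EF BB BF)."""
    return body.startswith(codecs.BOM_UTF8)


def find_meta_charset(body: bytes):
    """Return (charset, end_offset) for the first meta charset, or None."""
    match = META_CHARSET_RE.search(body)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower(), match.end()


def declared_early(body: bytes) -> bool:
    """True if a meta charset declaration ends within the first 1024 bytes."""
    found = find_meta_charset(body)
    return found is not None and found[1] <= PRESCAN_LIMIT
```

Working on raw bytes rather than decoded text is deliberate here: the point of the check is to find the declaration before the encoding is known.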
Why It Matters for SEO
The wrong character encoding produces garbled text: accents, emoji, and non-Latin scripts all break visibly. Browsers do not reliably default to UTF-8 when no encoding is declared; they sniff the content or fall back to a locale-dependent legacy encoding such as windows-1252, so the result can differ between browsers and users. Declaring UTF-8 explicitly removes the guesswork. The cost of getting this wrong is broken text that users see; the cost of fixing it is a single meta tag.
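To make the failure concrete, the short Python sketch below reproduces mojibake by writing text in one encoding and reading it in another. The function name misread is hypothetical and exists only for this illustration.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one charset, then decode the bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")


# UTF-8 bytes read as windows-1252: the two-byte "é" becomes two characters.
print(misread("café", "utf-8", "windows-1252"))  # cafÃ©

# ISO-8859-1 bytes read as UTF-8: the lone 0xE9 byte is invalid, so the
# decoder substitutes U+FFFD, the replacement character.
print(misread("café", "iso-8859-1", "utf-8"))  # caf\ufffd
```

Both directions lose information the reader can see, which is why the declared encoding and the actual bytes have to agree.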
How to Fix It
Add <meta charset="utf-8"> as the first child of <head> on every page. Set the Content-Type response header to text/html; charset=utf-8 at the server. Save all source files as UTF-8 without BOM. Re-test until both declarations agree and content renders correctly.
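The "save as UTF-8 without BOM" step can be scripted when an editor setting is not enough. The following Python sketch assumes you know the file's current encoding; the function names to_utf8_without_bom and resave, and the source_encoding parameter, are hypothetical names chosen for this example.

```python
import codecs
from pathlib import Path


def to_utf8_without_bom(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Return the same text encoded as UTF-8 with no leading BOM."""
    if raw.startswith(codecs.BOM_UTF8):
        # A UTF-8 BOM identifies the encoding; drop the three marker bytes.
        text = raw[len(codecs.BOM_UTF8):].decode("utf-8")
    else:
        # Raises UnicodeDecodeError if the bytes are invalid in source_encoding.
        text = raw.decode(source_encoding)
    return text.encode("utf-8")


def resave(path: Path, source_encoding: str = "utf-8") -> None:
    """Rewrite a file in place as UTF-8 without BOM."""
    path.write_bytes(to_utf8_without_bom(path.read_bytes(), source_encoding))
```

Single-byte encodings such as ISO-8859-1 accept any byte sequence, so passing the wrong source_encoding will not raise an error there; check the converted output by eye before overwriting originals.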
How It Works
We fetch the URL, parse the Content-Type response header, locate any <meta charset> in <head>, and verify that the bytes of the response body are actually encoded as declared. UTF-8 with no BOM is the recommended baseline.
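The comparison step might look roughly like the Python sketch below. It is an assumption about how such a check could be written, not the tool's real code: header_charset, body_is_valid_utf8, and audit are hypothetical names, and the meta charset value is assumed to have been extracted from the HTML beforehand.

```python
from email.message import Message


def header_charset(content_type: str):
    """Extract the lowercased charset from a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def body_is_valid_utf8(body: bytes) -> bool:
    """True if the raw bytes decode cleanly as UTF-8."""
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def audit(content_type: str, meta_charset, body: bytes) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    from_header = header_charset(content_type)
    from_meta = meta_charset.lower() if meta_charset else None
    if from_header is None:
        problems.append("Content-Type header has no charset")
    if from_meta is None:
        problems.append("no meta charset found")
    if from_header and from_meta and from_header != from_meta:
        problems.append("header and meta tag disagree")
    declared = from_header or from_meta
    if declared == "utf-8" and not body_is_valid_utf8(body):
        problems.append("declared UTF-8 but body is not valid UTF-8")
    return problems
```

For example, audit("text/html; charset=utf-8", "iso-8859-1", b"ok") reports that the header and meta tag disagree, while a body containing the lone byte 0xE9 under a UTF-8 declaration is flagged as not valid UTF-8.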
Common Mistakes to Avoid
- No charset declaration at all (browser must guess)
- Charset declared but file actually saved as ISO-8859-1
- Charset declared after the first 1024 bytes (too late; the browser may already have picked the wrong encoding)
- Conflicting charsets between meta tag and HTTP header
- BOM bytes confusing some parsers
Quick Checklist
- <meta charset="utf-8"> as first <head> child
- Content-Type header includes charset=utf-8
- All source files saved as UTF-8 without BOM
- Special characters and emoji render correctly
- No mojibake on any page