On HTML Escape Characters: How to Recognize Them in Code
2023-10-10
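Summary: Strings such as &#39; occasionally show up in data. A browser converts these escape sequences back to the original characters when it renders them, but how can code recognize them? This article looks at HTML escape characters.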
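Strings such as &#39; occasionally show up in data. They take one of two forms:
Starts with &#, followed by a run of digits, and ends with ;
Starts with &, followed by a run of characters, and ends with ;
For example, the most common one is &nbsp;, or its equivalent &#160;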
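The two forms described above can be matched with regular expressions. The following is a minimal sketch using only the Java standard library; the class name EntityPatternSketch and its helper methods are hypothetical and are not part of any library. It covers only the decimal and named forms mentioned here and assumes entity names consist of ASCII letters and digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class EntityPatternSketch {
    // Numeric form: "&#", then decimal digits, then ";" (for example &#160;).
    static final Pattern NUMERIC = Pattern.compile("&#(\\d+);");
    // Named form: "&", then a name, then ";" (for example &nbsp; or &frac14;).
    static final Pattern NAMED = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");
    // Either form, used to scan a longer string.
    static final Pattern ANY = Pattern.compile("&#\\d+;|&[A-Za-z][A-Za-z0-9]*;");

    static boolean isNumericEntity(String s) {
        return NUMERIC.matcher(s).matches();
    }

    static boolean isNamedEntity(String s) {
        return NAMED.matcher(s).matches();
    }

    // Returns every escape sequence found in the input, in order of appearance.
    static List<String> findEntities(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = ANY.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findEntities("a&nbsp;b&#160;c"));
    }
}
```

Matching only tells you that a sequence looks like an escape; whether a name is actually defined still requires a lookup table.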
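A browser converts these escape sequences back to the original characters when it renders them, but how can code recognize them? org.apache.commons.lang.StringEscapeUtils.unescapeHtml illustrates the approach well.
In the first case, where the middle part is a number, the number (a Unicode code point) is converted directly to a char.
In the second case, where the middle part is a name, the only option is a lookup table: find the number that corresponds to the name in the table, then convert that number to a char. The library's code makes this clear at a glance.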
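The two cases above can also be sketched with the standard library alone. This is an illustrative decoder under stated assumptions, not the implementation inside StringEscapeUtils: the class UnescapeSketch and its tiny five-entry table are hypothetical, only decimal numeric references are handled, and anything unrecognized is copied through unchanged.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical decoder, for illustration only.
final class UnescapeSketch {
    // A deliberately small name -> code table (assumption: a real table is much larger).
    private static final Map<String, Integer> NAMED = new HashMap<>();
    static {
        NAMED.put("quot", 34);
        NAMED.put("amp", 38);
        NAMED.put("lt", 60);
        NAMED.put("gt", 62);
        NAMED.put("nbsp", 160);
    }

    static String unescape(String in) {
        StringBuilder out = new StringBuilder(in.length());
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            // Only an '&' that is eventually followed by ';' can start an escape.
            int semi = (c == '&') ? in.indexOf(';', i + 1) : -1;
            if (semi < 0) {
                out.append(c);
                i++;
                continue;
            }
            String body = in.substring(i + 1, semi);
            int code = -1;
            if (body.matches("#\\d{1,7}")) {
                // Case 1: digits in the middle, use the number directly.
                code = Integer.parseInt(body.substring(1));
            } else {
                // Case 2: a name in the middle, look the number up in the table.
                Integer v = NAMED.get(body);
                if (v != null) {
                    code = v;
                }
            }
            if (code < 0 || code > Character.MAX_CODE_POINT) {
                // Unknown or out of range: keep the '&' and continue after it.
                out.append(c);
                i++;
            } else {
                out.appendCodePoint(code);
                i = semi + 1;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("&lt;p&gt;Tom &amp; Jerry&#39;s&lt;/p&gt;"));
    }
}
```

Using appendCodePoint rather than a cast to char keeps the sketch correct for code points above U+FFFF, which do not fit in a single Java char.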
Here is how HTML40 is defined.
The code is as follows:
static {
HTML40 = new Entities();
fillWithHtml40Entities(HTML40);
}
static void fillWithHtml40Entities(Entities entities) {
entities.addEntities(BASIC_ARRAY);
entities.addEntities(ISO8859_1_ARRAY);
entities.addEntities(HTML40_ARRAY);
}
Now look at what BASIC_ARRAY, ISO8859_1_ARRAY, and HTML40_ARRAY each contain.
BASIC_ARRAY
The code is as follows:
private static final String[][] BASIC_ARRAY = {
{"quot", "34"}, // " - double-quote
{"amp", "38"}, // & - ampersand
{"lt", "60"}, // < - less-than
{"gt", "62"}, // > - greater-than
};
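A table in this {name, code} shape has to be loaded into some lookup structure before it is useful. The sketch below shows one possible way to do that with two hash maps, one per direction. The method name addEntities mirrors the call seen earlier, but the class EntityTableSketch, its fields, and the method bodies are assumptions made for illustration and do not reproduce the library's Entities class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table, for illustration only.
final class EntityTableSketch {
    // name -> code is what unescaping needs; code -> name is what escaping needs.
    private final Map<String, Integer> nameToCode = new HashMap<>();
    private final Map<Integer, String> codeToName = new HashMap<>();

    // Same row layout as BASIC_ARRAY: { entity name, decimal code as a string }.
    static final String[][] BASIC = {
        {"quot", "34"},
        {"amp", "38"},
        {"lt", "60"},
        {"gt", "62"},
    };

    // Loads every row of the table into both maps.
    void addEntities(String[][] table) {
        for (String[] row : table) {
            int code = Integer.parseInt(row[1]);
            nameToCode.put(row[0], code);
            codeToName.put(code, row[0]);
        }
    }

    // Returns the code for a name, or -1 if the name is not defined.
    int codeOf(String name) {
        Integer code = nameToCode.get(name);
        return code == null ? -1 : code;
    }

    // Returns the name for a code, or null if the code has no named entity.
    String nameOf(int code) {
        return codeToName.get(code);
    }

    public static void main(String[] args) {
        EntityTableSketch table = new EntityTableSketch();
        table.addEntities(BASIC);
        System.out.println(table.codeOf("lt") + " " + table.nameOf(38));
    }
}
```

Calling addEntities several times with different arrays merges them into one table, which is the pattern fillWithHtml40Entities follows with its three arrays.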
ISO8859_1_ARRAY
The code is as follows:
static final String[][] ISO8859_1_ARRAY = {
{"nbsp", "160"}, // non-breaking space
{"iexcl", "161"}, // inverted exclamation mark
{"cent", "162"}, // cent sign
{"pound", "163"}, // pound sign
{"curren", "164"}, // currency sign
{"yen", "165"}, // yen sign = yuan sign
{"brvbar", "166"}, // broken bar = broken vertical bar
{"sect", "167"}, // section sign
{"uml", "168"}, // diaeresis = spacing diaeresis
{"copy", "169"}, // © - copyright sign
{"ordf", "170"}, // feminine ordinal indicator
{"laquo", "171"}, // left-pointing double angle quotation mark = left pointing guillemet
{"not", "172"}, // not sign
{"shy", "173"}, // soft hyphen = discretionary hyphen
{"reg", "174"}, // ® - registered trademark sign
{"macr", "175"}, // macron = spacing macron = overline = APL overbar
{"deg", "176"}, // degree sign
{"plusmn", "177"}, // plus-minus sign = plus-or-minus sign
{"sup2", "178"}, // superscript two = superscript digit two = squared
{"sup3", "179"}, // superscript three = superscript digit three = cubed
{"acute", "180"}, // acute accent = spacing acute
{"micro", "181"}, // micro sign
{"para", "182"}, // pilcrow sign = paragraph sign
{"middot", "183"}, // middle dot = Georgian comma = Greek middle dot
{"cedil", "184"}, // cedilla = spacing cedilla
{"sup1", "185"}, // superscript one = superscript digit one
{"ordm", "186"}, // masculine ordinal indicator
{"raquo", "187"}, // right-pointing double angle quotation mark = right pointing guillemet
{"frac14", "188"}, // vulgar fraction one quarter = fraction one quarter
{"frac12", "189"}, // vulgar fraction one half = fraction one half
{"frac34", "190"}, // vulgar fraction three quarters = fraction three quarters
{"iquest", "191"}, // inverted question mark = turned question mark
{"Agrave", "192"}, // À - uppercase A, grave accent
{"Aacute", "193"}, // Á - uppercase A, acute accent
{"Acirc", "194"}, // Â - uppercase A, circumflex accent
{"Atilde", "195"}, // Ã - uppercase A, tilde
{"Auml", "196"}, // Ä - uppercase A, umlaut
{"Aring", "197"}, // Å - uppercase A, ring
{"AElig", "198"}, // Æ - uppercase AE
{"Ccedil", "199"}, // Ç - uppercase C, cedilla
{"Egrave", "200"}, // È - uppercase E, grave accent
{"Eacute", "201"}, // É - uppercase E, acute accent
{"Ecirc", "202"}, // Ê - uppercase E, circumflex accent
{"Euml", "203"}, // Ë - uppercase E, umlaut
{"Igrave", "204"}, // Ì - uppercase I, grave accent
{"Iacute", "205"}, // Í - uppercase I, acute accent
{"Icirc", "206"}, // Î - uppercase I, circumflex accent
{"Iuml", "207"}, // Ï - uppercase I, umlaut
{"ETH", "208"}, // Ð - uppercase Eth, Icelandic
{"Ntilde", "209"}, // Ñ - uppercase N, tilde
{"Ograve", "210"}, // Ò - uppercase O, grave accent
{"Oacute", "211"}, // Ó - uppercase O, acute accent
{"Ocirc", "212"}, // Ô - uppercase O, circumflex accent
{"Otilde", "213"}, // Õ - uppercase O, tilde
{"Ouml", "214"}, // Ö - uppercase O, umlaut
{"times", "215"}, // multiplication sign
{"Oslash", "216"}, // Ø - uppercase O, slash
{"Ugrave", "217"}, // Ù - uppercase U, grave accent
{"Uacute", "218"}, // Ú - uppercase U, acute accent
{"Ucirc", "219"}, // Û - uppercase U, circumflex accent
{"Uuml", "220"}, // Ü - uppercase U, umlaut
{"Yacute", "221"}, // Ý - uppercase Y, acute accent
{"THORN", "222"}, // Þ - uppercase THORN, Icelandic
{"szlig", "223"}, // ß - lowercase sharps, German
{"agrave", "224"}, // à - lowercase a, grave accent
{"aacute", "225"}, // á - lowercase a, acute accent
{"acirc", "226"}, // â - lowercase a, circumflex accent
{"atilde", "227"}, // ã - lowercase a, tilde
{"auml", "228"}, // ä - lowercase a, umlaut
{"aring", "229"}, // å - lowercase a, ring
{"aelig", "230"}, // æ - lowercase ae
{"ccedil", "231"}, // ç - lowercase c, cedilla
{"egrave", "232"}, // è - lowercase e, grave accent
{"eacute", "233"}, // é - lowercase e, acute accent
{"ecirc", "234"}, // ê - lowercase e, circumflex accent
{"euml", "235"}, // ë - lowercase e, umlaut
{"igrave", "236"}, // ì - lowercase i, grave accent
{"iacute", "237"}, // í - lowercase i, acute accent
{"icirc", "238"}, // î - lowercase i, circumflex accent
{"iuml", "239"}, // ï - lowercase i, umlaut
{"eth", "240"}, // ð - lowercase eth, Icelandic
{"ntilde", "241"}, // ñ - lowercase n, tilde
{"ograve", "242"}, // ò - lowercase o, grave accent
{"oacute", "243"}, // ó - lowercase o, acute accent
{"ocirc", "244"}, // ô - lowercase o, circumflex accent
{"otilde", "245"}, // õ - lowercase o, tilde
{"ouml", "246"}, // ö - lowercase o, umlaut
{"divide", "247"}, // division sign
{"oslash", "248"}, // ø - lowercase o, slash
{"ugrave", "249"}, // ù - lowercase u, grave accent
{"uacute", "250"}, // ú - lowercase u, acute accent
{"ucirc", "251"}, // û - lowercase u, circumflex accent
{"uuml", "252"}, // ü - lowercase u, umlaut
{"yacute", "253"}, // ý - lowercase y, acute accent
{"thorn", "254"}, // þ - lowercase thorn, Icelandic
{"yuml", "255"}, // ÿ - lowercase y, umlaut
};
HTML40_ARRAY
The code is as follows:
static final String[][] HTML40_ARRAY = {
//
{"fnof", "402"}, // latin small f with hook = function= florin, U+0192 ISOtech -->
//
{"Alpha", "913"}, // greek capital letter alpha, U+0391 -->
{"Beta", "914"}, // greek capital letter beta, U+0392 -->
{"Gamma", "915"}, // greek capital letter gamma,U+0393 ISOgrk3 -->
{"Delta", "916"}, // greek capital letter delta,U+0394 ISOgrk3 -->
{"Epsilon", "917"}, // greek capital letter epsilon, U+0395 -->
{"Zeta", "918"}, // greek capital letter zeta, U+0396 -->
{"Eta", "919"}, // greek capital letter eta, U+0397 -->
{"Theta", "920"}, // greek capital letter theta,U+0398 ISOgrk3 -->
{"Iota", "921"}, // greek capital letter iota, U+0399 -->
{"Kappa", "922"}, // greek capital letter kappa, U+039A -->
{"Lambda", "923"}, // greek capital letter lambda,U+039B ISOgrk3 -->
{"Mu", "924"}, // greek capital letter mu, U+039C -->
{"Nu", "925"}, // greek capital letter nu, U+039D -->
{"Xi", "926"}, // greek capital letter xi, U+039E ISOgrk3 -->
{"Omicron", "927"}, // greek capital letter omicron, U+039F -->
{"Pi", "928"}, // greek capital letter pi, U+03A0 ISOgrk3 -->
{"Rho", "929"}, // greek capital letter rho, U+03A1 -->
//
{"Sigma", "931"}, // greek capital letter sigma,U+03A3 ISOgrk3 -->
{"Tau", "932"}, // greek capital letter tau, U+03A4 -->
{"Upsilon", "933"}, // greek capital letter upsilon,U+03A5 ISOgrk3 -->
{"Phi", "934"}, // greek capital letter phi,U+03A6 ISOgrk3 -->
{"Chi", "935"}, // greek capital letter chi, U+03A7 -->
{"Psi", "936"}, // greek capital letter psi,U+03A8 ISOgrk3 -->
{"Omega", "937"}, // greek capital letter omega,U+03A9 ISOgrk3 -->
{"alpha", "945"}, // greek small letter alpha,U+03B1 ISOgrk3 -->
{"beta", "946"}, // greek small letter beta, U+03B2 ISOgrk3 -->
{"gamma", "947"}, // greek small letter gamma,U+03B3 ISOgrk3 -->
{"delta", "948"}, // greek small letter delta,U+03B4 ISOgrk3 -->
{"epsilon", "949"}, // greek small letter epsilon,U+03B5 ISOgrk3 -->
{"zeta", "950"}, // greek small letter zeta, U+03B6 ISOgrk3 -->
{"eta", "951"}, // greek small letter eta, U+03B7 ISOgrk3 -->
{"theta", "952"}, // greek small letter theta,U+03B8 ISOgrk3 -->
{"iota", "953"}, // greek small letter iota, U+03B9 ISOgrk3 -->
{"kappa", "954"}, // greek small letter kappa,U+03BA ISOgrk3 -->
{"lambda", "955"}, // greek small letter lambda,U+03BB ISOgrk3 -->
{"mu", "956"}, // greek small letter mu, U+03BC ISOgrk3 -->
{"nu", "957"}, // greek small letter nu, U+03BD ISOgrk3 -->
{"xi", "958"}, // greek small letter xi, U+03BE ISOgrk3 -->
{"omicr
