UTF-8: a character encoding
Unicode covers almost all existing character sets, and UTF-8 is the most widely favored encoding form for it. UTF-8 is compatible with ASCII, resistant to data corruption, efficient, and easy to process. But first things first.
Encoding forms
Computers handle numbers not as abstract mathematical objects but as combinations of fixed-size storage units, such as bytes and 32-bit words. An encoding standard has to take this into account when it specifies how character numbers are represented.
In a computer, integers are stored in memory cells of 8 bits (1 byte), 16 bits, or 32 bits. Each Unicode encoding form defines which sequence of such cells corresponds to the integer assigned to a particular character. The standard provides three encoding forms, built on 8-bit, 16-bit, and 32-bit units. They are called UTF-8, UTF-16, and UTF-32 respectively, where UTF stands for Unicode Transformation Format. All three are equally valid representations of Unicode characters, and each has advantages in different applications.
Each encoding form can represent every character in the Unicode standard, so the three are fully interoperable solutions for different purposes. Text in any one of them can be converted unambiguously into either of the other two without loss of data.
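The lossless conversion described above can be illustrated with a minimal Python sketch. The sample string is an arbitrary choice made for illustration, and the little-endian codec names are used only to keep byte order marks out of the picture.

```python
# Round-trip one string through the three Unicode encoding forms.
# The sample text is an assumption chosen to cover 1-, 2-, 3- and 4-byte UTF-8 sequences.
text = "A\u00e9\u4e2d\U0001F600"

utf8 = text.encode("utf-8")
# UTF-8 -> UTF-16 -> UTF-32 -> UTF-8, decoding and re-encoding at each step.
utf16 = utf8.decode("utf-8").encode("utf-16-le")
utf32 = utf16.decode("utf-16-le").encode("utf-32-le")
back_to_utf8 = utf32.decode("utf-32-le").encode("utf-8")

# The bytes we end with are the bytes we started with: nothing was lost.
round_trip_ok = back_to_utf8 == utf8
print(round_trip_ok)
```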
The non-overlap principle
Each Unicode encoding form is designed so that its code units do not overlap, which older encodings do not guarantee. Windows code page 932, for example, builds characters from one or two bytes. The length of a sequence depends on its first byte, so the values of a lead byte of a two-byte sequence and of a single byte are disjoint. The values of a single byte and of a trail byte, however, can coincide. As a result, a search for the character "D" (hexadecimal code 44) can falsely match the second byte of the two-byte sequence for the Cyrillic character "Д" (code 84 44). To decide which interpretation is correct, the program has to examine the preceding bytes.
Things get worse when lead-byte and trail-byte values also coincide. To resolve the ambiguity, the program must then scan backward until it reaches the start of the text or an unambiguous code sequence. This is inefficient, and it is also fragile: a single wrong byte can make the rest of the text unreadable.
The Unicode transformation formats avoid this problem because the values of lead, trail, and single storage units never coincide. Every Unicode search and comparison is therefore reliable and cannot return a false match caused by accidentally matching different parts of character codes. Observing the non-overlap principle is what distinguishes these encoding forms from other multibyte East Asian encodings.
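The false match is easy to reproduce. The sketch below assumes Python's built-in `cp932` codec as a stand-in for Windows code page 932 and performs a naive byte-level search in both encodings.

```python
# Naive byte-level search for "D" inside text that contains only the Cyrillic letter De.
needle = "D"
haystack = "\u0414"  # CYRILLIC CAPITAL LETTER DE

cp932_bytes = haystack.encode("cp932")   # two bytes: 0x84 0x44
utf8_bytes = haystack.encode("utf-8")    # two bytes: 0xD0 0x94

# In code page 932 the trail byte 0x44 is also the code for "D": a false hit.
cp932_hit = needle.encode("cp932") in cp932_bytes
# In UTF-8 no byte of a multibyte sequence falls in the ASCII range: no hit.
utf8_hit = needle.encode("utf-8") in utf8_bytes

print(cp932_hit, utf8_hit)
```

A real search routine for code page 932 would have to track lead bytes to reject the first result, which is exactly the extra work the non-overlap principle removes.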
Another consequence of non-overlap is that every character has well-defined boundaries, so there is no need to scan back over an indefinite number of preceding characters. This property is called self-synchronization. A corrupted code unit damages only one character, and the surrounding characters remain intact. In the 8-bit transformation format, if a pointer lands on a byte of the form 10xxxxxx (in binary), the start of the character is at most three bytes back.
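A minimal sketch of that backward step, assuming well-formed UTF-8 input; the helper name `char_start` is made up for this illustration.

```python
def char_start(data: bytes, i: int) -> int:
    """Return the index of the first byte of the character that covers data[i].

    Assumes data is well-formed UTF-8. Continuation bytes match 10xxxxxx,
    so we step back over them, never more than three times.
    """
    steps = 0
    while i > 0 and (data[i] & 0xC0) == 0x80 and steps < 3:
        i -= 1
        steps += 1
    return i


# "a" is 1 byte, the euro sign is 3 bytes, the emoji is 4 bytes.
data = "a\u20ac\U0001F600".encode("utf-8")

# Pointing at the last byte of the emoji (index 7) leads back to its lead byte (index 4).
emoji_start = char_start(data, 7)
# Pointing at the last byte of the euro sign (index 3) leads back to index 1.
euro_start = char_start(data, 3)
print(emoji_start, euro_start)
```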
Conformance
The Unicode Consortium fully endorses all three encoding forms. It is important not to set "UTF-8" against "Unicode": all of the transformation formats are equally legitimate ways of implementing the Unicode character encoding standard.
Byte orientation
UTF-32 represents each character with a single 32-bit code unit whose value equals the Unicode code point. UTF-16 uses one or two 16-bit units. UTF-8 uses from one to four bytes.
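To make the unit counts concrete, here is a small Python sketch; the helper `code_units` and the three sample characters are illustrative assumptions, not part of any standard API.

```python
def code_units(ch: str) -> dict:
    """Count how many code units one character needs in each encoding form."""
    return {
        "UTF-8": len(ch.encode("utf-8")),            # 8-bit units
        "UTF-16": len(ch.encode("utf-16-le")) // 2,  # 16-bit units
        "UTF-32": len(ch.encode("utf-32-le")) // 4,  # 32-bit units
    }


for ch in ("A", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}", code_units(ch))

# The single UTF-32 unit is numerically the code point itself.
euro_unit = int.from_bytes("\u20ac".encode("utf-32-le"), "little")
```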
UTF-8 was designed for compatibility with byte-oriented, ASCII-based systems. Much existing software and information technology practice has long relied on characters being represented as a sequence of bytes. Many protocols depend on the ASCII values staying unchanged and either use or avoid particular control characters. A simple way to adapt such systems to Unicode is an 8-bit encoding of Unicode characters in which every ASCII character and control character keeps its ASCII value. UTF-8 serves exactly this purpose.
Variable length
UTF-8 is a variable-length encoding built from 8-bit storage units, in which the high bits of each byte indicate which part of the sequence that byte belongs to. One range of values is reserved for the first unit of a code sequence and another for the units that follow. This is what guarantees that the encoding is non-overlapping.
ASCII
The UTF-8 code values for ASCII characters are the ASCII values themselves (0x00-0x7F). Unicode characters U+0000-U+007F are therefore encoded as the single bytes 0x00-0x7F in UTF-8 and are indistinguishable from ASCII. To avoid ambiguity, the values 0x00-0x7F are never used in any other byte of a UTF-8 sequence. Characters in the range U+0080-U+07FF, which covers most non-ideographic scripts beyond ASCII, use two-byte sequences. Characters in the range U+0800-U+FFFF are represented by three bytes, and supplementary characters above U+FFFF require four bytes.
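The ranges above translate directly into a length rule. The following sketch uses a hypothetical helper, `utf8_length`, and checks it against Python's own encoder; surrogate code points (U+D800-U+DFFF) are deliberately left out because they cannot be encoded.

```python
def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 needs for a code point, following the ranges in the text."""
    if code_point <= 0x7F:
        return 1
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        return 3
    return 4


# Boundary values of each range, compared with what the built-in encoder produces.
samples = [0x41, 0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF]
matches = all(utf8_length(cp) == len(chr(cp).encode("utf-8")) for cp in samples)

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8.
ascii_identical = "Hello".encode("ascii") == "Hello".encode("utf-8")
print(matches, ascii_identical)
```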
Areas of use
UTF-8 is generally the preferred encoding for HTML and similar protocols.
XML was the first standard with full support for UTF-8, and standards organizations recommend it as well. The problem of supporting URLs that contain non-ASCII characters was settled when the W3C and the IETF engineering group agreed that all URLs should be encoded exclusively in UTF-8.
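As an illustration of UTF-8 in URLs, the sketch below percent-encodes a path with Python's standard `urllib.parse.quote`, which uses UTF-8 by default. The path itself is an invented example.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing a non-ASCII character.
path = "/wiki/Caf\u00e9"

# quote() converts the text to UTF-8 bytes and percent-encodes the non-ASCII ones.
encoded = quote(path)
print(encoded)

# unquote() reverses the process, again assuming UTF-8.
decoded = unquote(encoded)
```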
Compatibility with ASCII makes it easier to migrate existing software. UTF-8 is used by most text editors, including JEdit, Emacs, BBEdit, Eclipse, and Notepad on Windows. No other Unicode encoding form can claim such broad tool support.
A further advantage is that the encoding is simply a sequence of bytes, so UTF-8 strings are easy to handle in C and other programming languages. It is also the only encoding form that needs neither a byte order mark (BOM) nor an encoding declaration in XML.
Self-synchronization
In environments that process 8-bit characters, UTF-8 has the following advantages over other multibyte encodings:
- The first byte of a code sequence indicates its length. This makes forward scanning more efficient.
- Finding the start of a character is simple, because the initial byte is restricted to a fixed range of values.
- Byte values do not overlap.
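The first point in the list can be sketched as follows. Both helper names are invented for this example, and the code assumes well-formed UTF-8 input; it does not validate continuation bytes.

```python
def sequence_length(lead: int) -> int:
    """Length of a UTF-8 sequence, read from its first byte alone."""
    if lead < 0x80:
        return 1              # 0xxxxxxx: ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2              # 110xxxxx
    if 0xE0 <= lead <= 0xEF:
        return 3              # 1110xxxx
    if 0xF0 <= lead <= 0xF4:
        return 4              # 11110xxx
    raise ValueError(f"0x{lead:02X} cannot start a UTF-8 sequence")


def split_chars(data: bytes) -> list:
    """Walk forward through well-formed UTF-8, one whole character at a time."""
    chunks = []
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        chunks.append(data[i:i + n])
        i += n
    return chunks


chunks = split_chars("a\u00e9\u20ac\U0001F600".encode("utf-8"))
print(chunks)
```

Because continuation bytes (0x80-0xBF) never appear in the lead-byte ranges, `sequence_length` rejects them, which is the third point in the list in action.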
Comparative advantages
UTF-8 is compact. East Asian characters, however (Chinese, Japanese, and Korean, all of which use Chinese ideographs), take 3-byte sequences. UTF-8 is also slower to process than the other encoding forms. On the other hand, a binary sort of UTF-8 strings gives the same order as a binary sort of Unicode code points.
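Both claims can be checked with a few lines of Python. The word list is an arbitrary sample, and the UTF-16 comparison is added here only as a contrast; it relies on the fact that supplementary characters are stored as surrogate pairs in UTF-16.

```python
# Four one-character strings: ASCII, CJK, a high BMP character, a supplementary character.
words = ["\uff5e", "\U00010000", "a", "\u4e2d"]

# Python compares str values by code point, so this is code point order.
by_code_point = sorted(words)
# Binary (bytewise) order of the UTF-8 encodings.
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))
# Binary order of the big-endian UTF-16 encodings, for contrast.
by_utf16_bytes = sorted(words, key=lambda s: s.encode("utf-16-be"))

same_as_code_points = by_utf8_bytes == by_code_point
utf16_differs = by_utf16_bytes != by_code_point

# A CJK ideograph costs 3 bytes in UTF-8 but 2 bytes in UTF-16.
cjk_sizes = (len("\u4e2d".encode("utf-8")), len("\u4e2d".encode("utf-16-be")))
print(same_as_code_points, utf16_differs, cjk_sizes)
```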
Character encoding scheme
A character encoding scheme consists of a character encoding form plus a rule for serializing its code units into bytes. To identify the encoding scheme, the Unicode standard allows an initial byte order mark (BOM).
In UTF-8, the role of the BOM is limited to signalling the encoding form. UTF-8 has no byte-order problem, because its code unit is a single byte. Using a BOM with this encoding form is neither required nor recommended. A BOM can still turn up in text that was converted from encodings that use a byte order mark, or as a signature for UTF-8. It is the three-byte sequence EF BB BF (hexadecimal).
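A short sketch of how the signature looks in practice, using the `codecs.BOM_UTF8` constant and the `utf-8-sig` codec from Python's standard library. The helper `strip_bom` and the sample text are assumptions made for this example.

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 signature (EF BB BF) if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = codecs.BOM_UTF8 + "text".encode("utf-8")

# The plain utf-8 codec keeps the BOM as the character U+FEFF ...
kept = with_bom.decode("utf-8")
# ... while utf-8-sig removes it on decoding and writes it on encoding.
removed = with_bom.decode("utf-8-sig")
written = "text".encode("utf-8-sig")

print(codecs.BOM_UTF8.hex(), repr(kept), repr(removed))
```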
How to set UTF-8 encoding
In HTML, UTF-8 encoding is declared with the following code:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
In PHP, UTF-8 encoding is set with the header() function at the start of the file, after the error reporting level has been set:
<?php
error_reporting(-1);
header("Content-Type: text/html; charset=utf-8");
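When connecting to a MySQL database, UTF-8 encoding is set as follows:
<?php
// The mysql_* extension was removed in PHP 7; with mysqli the equivalent call is mysqli_set_charset($link, 'utf8').
mysql_set_charset('utf8');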
In a CSS file, UTF-8 encoding is declared as follows:
@charset "utf-8";
When saving files of any type, choose UTF-8 encoding without a BOM, otherwise the site may not work: a BOM placed before the opening PHP tag is sent to the browser as output, and later header() calls then fail. To do this in Dreamweaver, choose the menu item "Modify - Page Properties - Title/Encoding" and change the encoding to UTF-8. Then reload the page, clear the "Include Unicode Signature (BOM)" checkbox, and apply the changes. If any text on the page or in the database was entered in a different encoding, it has to be re-entered or re-encoded. When working with regular expressions, be sure to use the u modifier.
You can also save a document in UTF-8 in Windows Notepad. Choose the menu item "File - Save As...", set the required encoding, and save the file in UTF-8.
In the Notepad++ text editor, if the file is in an encoding other than UTF-8, use the menu item "Convert to UTF-8 without BOM" to change the encoding, then save the file.
No alternative
In a globalized world, where political and linguistic borders are being erased, character sets tied to local conventions are of little use. Unicode is the single character set designed to cover all localizations, and UTF-8 is an example of an effective implementation of Unicode, one that:
- is supported by a wide range of tools and is compatible with ASCII;
- is resistant to data corruption;
- is simple and efficient to process;
- is platform independent.
With the arrival of UTF-8, arguments about which encoding or character set is better have lost their point.