
I'm trying to build a Japanese study guide. I downloaded a huge Japanese-English dictionary file in XML from here. It's about 3,000,000 lines long.

Here is an example entry from the file:

<JMdict>
    <entry>
        <ent_seq>1000110</ent_seq>
        <k_ele>
            <keb>CDプレーヤー</keb>
            <ke_pri>spec1</ke_pri>
        </k_ele>
        <k_ele>
            <keb>CDプレイヤー</keb>
        </k_ele>
        <r_ele>
            <reb>シーディープレーヤー</reb>
            <re_restr>CDプレーヤー</re_restr>
            <re_pri>spec1</re_pri>
        </r_ele>
        <r_ele>
            <reb>シーディープレイヤー</reb>
            <re_restr>CDプレイヤー</re_restr>
        </r_ele>
        <sense>
            <pos>&n;</pos>
            <gloss>CD player</gloss>
        </sense>
    </entry>
</JMdict>

I'm not too familiar with XML. I want to be able to search through the file and return the entry information. <keb> and <reb> hold the Japanese terms, and <gloss> holds the English meaning. There are multiple <keb> and <reb> elements because there are multiple ways to say and spell the same word. A simple way to type an English word into an input box and search each <entry> for a match in <gloss> using regex would be enough to get the project rolling. I just want to type an English word and get back the Japanese equivalent.

  • If I remember correctly, this is very easy with jQuery. You just pass it your XML, it becomes a jQuery object. Something like var $xml = $(xmldata); $xml.find("ent_seq"); $xml.find("entry").find("re_pri") Commented Jun 9, 2017 at 14:17
  • Possible duplicate of how to extract values from an XML document using Javascript and here Commented Jun 9, 2017 at 14:19

1 Answer

var xml = document.createElement("div");
xml.innerHTML = "<JMdict><entry><ent_seq>1000110</ent_seq><k_ele><keb>CDプレーヤー</keb><ke_pri>spec1</ke_pri></k_ele><k_ele><keb>CDプレイヤー</keb></k_ele><r_ele><reb>シーディープレーヤー</reb><re_restr>CDプレーヤー</re_restr><re_pri>spec1</re_pri></r_ele><r_ele><reb>シーディープレイヤー</reb><re_restr>CDプレイヤー</re_restr></r_ele><sense><pos>&n;</pos><gloss>CD player</gloss></sense></entry></JMdict>"

// In practice, set xml.innerHTML to the XML text loaded from your file or URL.

xml.querySelector("keb").textContent;   // text of the first <keb>
xml.querySelector("reb").textContent;   // text of the first <reb>
xml.querySelector("gloss").textContent; // text of the first <gloss>

Use xml.querySelectorAll("keb")[0] or [1] to access a specific <keb>, or write any selector you need, for example:

xml.querySelector("k_ele keb");                // first <keb> inside a <k_ele>
xml.querySelector("r_ele re_pri").textContent; // returns "spec1"
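To get closer to what the question asks for (type an English word, get the Japanese terms back), here is a minimal sketch that works on the XML text directly with regular expressions instead of the DOM. The names `sampleXml`, `textsOf`, `escapeRegExp`, and `searchByGloss` are made up for this illustration. It assumes the tags carry no attributes, as in the sample entry, so a `<gloss>` with attributes would be skipped, and it has not been tried against the full 3,000,000-line file.

```javascript
// Sample data: the entry from the question. In a real page this string would
// be the text of the downloaded dictionary file.
const sampleXml = `<JMdict>
<entry>
<ent_seq>1000110</ent_seq>
<k_ele><keb>CDプレーヤー</keb><ke_pri>spec1</ke_pri></k_ele>
<k_ele><keb>CDプレイヤー</keb></k_ele>
<r_ele><reb>シーディープレーヤー</reb><re_restr>CDプレーヤー</re_restr><re_pri>spec1</re_pri></r_ele>
<r_ele><reb>シーディープレイヤー</reb><re_restr>CDプレイヤー</re_restr></r_ele>
<sense><pos>&n;</pos><gloss>CD player</gloss></sense>
</entry>
</JMdict>`;

// Collect the text of every attribute-less <tag>...</tag> in a chunk of XML text.
function textsOf(chunk, tag) {
  const pattern = new RegExp(`<${tag}>([^<]*)</${tag}>`, "g");
  return Array.from(chunk.matchAll(pattern), (m) => m[1]);
}

// Escape regex metacharacters so the user's input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return every entry that has a <gloss> containing `word`
// as a whole word or phrase, ignoring case.
function searchByGloss(xmlText, word) {
  const wanted = new RegExp(`\\b${escapeRegExp(word)}\\b`, "i");
  const entries = xmlText.match(/<entry>[\s\S]*?<\/entry>/g) || [];
  const results = [];
  for (const entry of entries) {
    const glosses = textsOf(entry, "gloss");
    if (glosses.some((gloss) => wanted.test(gloss))) {
      results.push({
        kanji: textsOf(entry, "keb"),    // written forms
        readings: textsOf(entry, "reb"), // kana readings
        glosses,                         // English meanings
      });
    }
  }
  return results;
}

// Usage: look up an English term and print the matching entries.
console.log(searchByGloss(sampleXml, "cd player"));
```

The whole-word match means a search for "play" does not return entries that only mention "player". For the real file you would load the text once (for example with fetch) and pass the value of the input box as `word`.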