\$\begingroup\$

I have the following Ruby code, which parses an XML document SAX-style (this is a heavily simplified version):

require 'nokogiri'
require 'stringio' # needed for StringIO in the example usage below

class ParsingError < RuntimeError; end

class Parser
  def initialize mapper, collector, error_collector
    @error_collector = error_collector
    @sax_handler = Handler.new(collector, @error_collector, mapper)
  end

  def process io_stream
    parser = Nokogiri::XML::SAX::Parser.new(@sax_handler)
    parser.parse_io(io_stream, 'UTF-8')
  rescue ParsingError => e
    @error_collector << "ParsingError: #{e.message}"
  end
end

class Handler < Nokogiri::XML::SAX::Document
  def initialize collector, error_collector, mapper
    @mapper = mapper
    @collector = collector
    @error_collector = error_collector
  end

  def start_element name, attrs=[]
    return if name=='nodes'
    @current_element = @mapper.new(name)
  end

  def end_element name
    return if @current_element.nil?
    data = @current_element.get_errors
    if data[:errors].empty?
      @collector << @current_element.result
    else
      @error_collector << data
    end
    @current_element = nil
  end

  def characters string
    @current_element.record(string) if @current_element
  end

  def error message
    raise ParsingError, message
  end
end

class Mapper
  def initialize name
    @data = ""
    @name = name
  end

  def record str
    @data << str
  end

  def result
    {name: @name, data: @data} # here are more properties
  end

  def get_errors
    arr = []
    if @data.index("Error 123")
      arr << "Error Code 123"
    elsif @data.index("Error 456")
      arr << "Error Code 456"
    elsif @data.index("Error 789")
      arr << "Error Code 789"
      # and a lot more validations
    end
    {name: @name, errors: arr}
  end
end

# Example usage
xml = %{<?xml version="1.0" encoding="UTF-8"?>
  <nodes>
    <node1>some more data</node1>
    <node2>data Error 123</node2>
    <malformed xml
  </nodes>
}
collector = []
error_collector = []
mapper = Mapper
parser = Parser.new(mapper, collector, error_collector)
parser.process(StringIO.new(xml))
puts "Data:"
puts collector
puts
puts "Errors:"
puts error_collector

My problem is with data collection and error handling.

  1. I'm passing two arrays around, collector and error_collector, which doesn't seem right. I'd like to collect data and errors in one place, but the collector itself should only need to implement a << method (like an Array). Any advice on how to improve this?
  2. The error handling in general feels awkward. I need it in two places: for logging parsing errors and for validating items. So there are two kinds of errors, parsing errors and node validation errors, and only the validation errors carry a node name to identify the item. How could I improve this code and remove the code smells?
  3. Any other advice on how to improve the code above?
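
To make point 1 concrete, here is a rough sketch of the kind of single collector I have in mind. `ResultCollector` and its `records`/`errors` readers are hypothetical names that do not exist in the code above, and the rule used to tell records from errors (a hash with no `:errors` entries is a record, anything else is an error) is only an assumption for illustration:

```ruby
# Hypothetical single collector: everything is pushed in through <<,
# and the collector decides where each entry belongs.
class ResultCollector
  attr_reader :records, :errors

  def initialize
    @records = []
    @errors = []
  end

  # Assumption: a Hash with a missing or empty :errors list is a record;
  # a Hash with errors, or a plain String (e.g. a parsing error message),
  # is filed as an error.
  def <<(entry)
    if entry.is_a?(Hash) && Array(entry[:errors]).empty?
      @records << entry
    else
      @errors << entry
    end
    self # allow chaining, like Array#<<
  end
end

collector = ResultCollector.new
collector << { name: 'node1', data: 'some more data' }
collector << { name: 'node2', errors: ['Error Code 123'] }
collector << 'ParsingError: malformed xml'
```

With something like this, Parser and Handler would only need a single collector argument, but whether the sorting rule belongs in the collector at all is exactly what I'm unsure about.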
\$\endgroup\$
