3

I'm developing a program that scrapes the web for certain data and feeds it back to the database. The problem is that I don't want duplicate entries of the same data as soon as the crawlers run for a second time. If some attributes changed, but the majority of the data is still the same, I'd like to update the DB entry rather than simply adding a new one. I know how to do this in code, but I was wondering if this could be done better.

The way the update works right now:

//This method calls several other methods to check if the event in question already exists. If it does, it updates it using the id it returns. 
//If it doesn't exist, -1 is returned as an id.
public static void check_event(Event event)
{
    int id = -1;

    id = check_exact_event(event); //Check if an event exists with the same title, location and time.
    if(id > 0)
    {
        update_event(event, id);
        Logger.log("EventID #" + id + " found using exact comparison");
        return;
    }

    id = check_similar_event_titles(event); //Check if event exists with a different (but similar) title
    if(id > 0)
    {
        update_event(event, id);
        Logger.log("EventID #" + id + " found using similar title comparison");
        return;
    }

    id = check_exact_image(event); //Check if event exists with the exact same image
    if(id > 0)
    {
        update_event(event, id);
        Logger.log("EventID #" + id + " found using image comparison");
        return;
    }

    //Otherwise insert new event
    create_new_event(event);
}

This works, but it's not very pleasing to the eye. What's the best way to go about this?

7
  • 1
    Looks OK to me, I've wrote code like this before. Commented May 17, 2016 at 9:21
  • 2
    In Java, convention is to use camelCase (update_event -> updateEvent, check_event -> checkEvent etc). And it's not that bad as code. Commented May 17, 2016 at 9:26
  • @T.Claverie I'll keep that in mind, thanks! That's just for method names right? Commented May 17, 2016 at 9:28
  • 2
    This question may be more appropriate on Code Review. Commented May 17, 2016 at 9:35
  • In theory if you move your log to method which gets you event, you could achieve it in single if statement, code would be shorter, but want be cleaner Commented May 17, 2016 at 9:39

1 Answer 1

3

Personally i can'tsee anything wron with your code, it is clean and effective. If you really want to change it, you could do it in single if statement

public static void check_event(Event event) {
        int id = -1;

        if ((id = check_exact_event(event)) > 0
                || (id = check_similar_event_titles(event)) > 0
                || (id = check_exact_image(event)) > 0) {
            update_event(event, id);
        }
        ;
        create_new_event(event);
    }

But i cant see much gain in this way

Sign up to request clarification or add additional context in comments.

3 Comments

Since the code is repetitive, I would do it like this. This is much better if you need to update this logic, no need to copy paste the modif two times. Want to add a message, only need to add it once.
@AxelH see your point, maybe is my personal opinion, but for me assigning variables in this way might affect readability.
Of course, actually I wouldn't work in static at all for this one, check_*_event would return a boolean and store the id in a instance variable. The condition would only be check || check || check which is more readable and a would use the getter to send to updateEvent the correct id. A checkMethods should return a boolean for me, but this is also an opinion ;)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.