12

I'm developing an Android app (Android 1.6), but this is probably a more general Java question.

I have an ArrayList of about 10,000 objects

the objects contain 3 strings (firstName, middleName, lastName).

The user is presented with a "search box" on android where they can search for a particular "object" by typing in part of the name.

I have a class (which I call Filterer) that searches through the list of 10,000 for matching objects and then returns them as a "sublist".

The search is a little bit SLOW (especially on an Android handset) and I'm sure I'm not doing the search/filtering in the most efficient manner possible.

Does anyone have any suggestions on how to speed up my search? My code is below. One possibility to to search against a secondary "masterList" that already has every piece of information in lowercase and concatenated…but there may be additional ways to improve this search that would also help.

TIA!!

public void filterNames() {
  this.filteredList.clear();
  String sv = this.searchString.toString.trim().toLowerCase(); // search value
  for (int i = 0; i < this.masterList.size(); i++) {
    MyObject d = this.masterList.get(i);
    String fn = d.getFirstName().toString().toLowerCase();
    String mn = d.getMiddleName().toString().toLowerCase();
    String ln = d.getLastName().toString().toLowerCase();

    if (fn.indexOf(sv) >= 0 || 
        md.indexOf(sv) >= 0 || 
        ln.indexOf(sv) >= 0) {
      this.currentList.add(d);
    }
  }
}
1
  • Look here for similar problem: stackoverflow.com/questions/2085445/… it is asked with c++ in mind, but general solution (data structures and algorithms) is language independent. Commented Jan 26, 2010 at 14:13

5 Answers 5

7

Yes, it's certainly painful to lower-case several objects for each loop iteration (plus a possibly redundant toString?), and also bad practice to call list.size() for every iteration — that value should be cached before the loop starts.

Anyway, if you're working with this much data, is there a reason that you're not using an SQLite database for storage and displaying/filtering your list using CursorAdapter?

That would be the recommended way to implement something of this size.

Sign up to request clarification or add additional context in comments.

4 Comments

Will SQLite (or other SQL DBMS) really help with infix search? Does it have special kind of index for that?
Local loop "size" variables are a Java Old Wives Tale, much like declaring methods "final." The JVM will inline the size() call and you will see no performance gains.
@Civil Disobedient: this is true for most JVMs, but not necessarily true for the Dalvik VM on Android devices. See developer.android.com/intl/fr/guide/practices/design/… for more info.
It works well enough for the filtering functionality Android Contacts application (or is it content provider?). That's probably a good place to have a look.
2

Maybe you can trade some space for some speed? Create some form of an index for your data?

For example:

  1. Create a list for each character (a-z) with all "MyObject"s where a part of the name contains the character (be aware of special characters!). For each entry count the number of "MyObject"s
  2. If a user type in an query, look for the individual characters and only search the list with the smallest amount of entries.

Of course the addition of an name would require you to add it to the index.

Comments

1

may be too late answer but it's help for other in stuck same problem.

Java 8 (2014) solves this problem using streams and lambdas in one line of code:

Using Stream Api you can filter data without for loop and more feature's are available.

List<MyObject> mFilteredMyObjectList = mMyObjectList.stream()
    .filter(d -> d.getFirstName().toString().toLowerCase().indexOf(sv) >= 0 
    || d.getMiddleName().toString().toLowerCase().indexOf(sv) >= 0 
    || d.getLastName().toString().toLowerCase().indexOf(sv) >= 0).collect(Collectors.toList());

For more info see below link,

Link1 Link2

1 Comment

Stream Api available for 24 and up by default.
0

After researching a bit more I have found that Suffix Arrays could get you the fasted answers. Also have a look at the Wikipedia entry for Suffix Trees for a bit more of an in depth explanation.
Besdies that I agree with the answer above that you could probably use an SQL Database for such queries. Doing an Sql Query against the data is probably one of the fastest ways to get what you want without suffix arrays.
One thing to speed things up a little without doing SQL would be to put firstName, middleName, lastName into one lowercase string, and put that into a new Map that is referencing the Array index. That way you can reduce search to just 10.000 strings of the hashmap without having to do a lowercase every time. It might be a bit faster but of course require more memory. Maybe try to do something with regular expressions to speed up the matching.
Another option would be to really create a searchindex with something like Lucene even though I think that would be really overkill on an Android device but could work in plain Java and infix search in Lucene isn't super high performance either.

5 Comments

Will SQLite (or other SQL DBMS) really help with infix search? Does it have special kind of index for that? As far as I know, standard SQL indexes are not designed to do fast infix (contains) searches.
Well it will definitely not be the fastest way, using a proper full text index would be faster. But I believe that doing the query in SQL Lite is faster than the search through the array
1) AFAIK full text search solutions (Lucene etc.) are not designed to accelerate infix searches. If you know that they are, please give link to article / documentation chapter about that. 2) What is your belief based on? Even SQL engine must iterate through all items (records) just like iterating through all items in arraylist will do. This is because of infix search involved, if it would be simpler kind of search (prefix search, exact value search etc.) - there would be serious gain on SQL by using index.
@WildWezyr: Yes an infix search is always expensive in time but Lucene supports it wiki.apache.org/lucene-java/… it won't be something like O(1) but it should at least be faster than the Java Code posted. Also the same is true for SQL since SQLite is running natively (C code) on Android I expect it to be faster than the dalvik code. If you really want to go all in and have the fastest search possible you need to go with something like a Suffix Tree or Suffix Array. Check Wikipedia for those.
1) lucene's faq entry you have referenced states that lucene merely supports it, but it is by default turned off, because it is too expensive. conclusion: lucene is not good solution for infix search. 2) native C code should (?) be faster than Java code - but it is not always true and it might not be true for this particular case. maybe it is worth giving a try (code modification to use SQL DB instead of ArrayList) but there is no certainty it will actually help here. 3) first i would try to optimize existing code (e.g. get rid of lowercasing etc.).
-1

How are you initially retrieving the 10,000+ list? If you're just using an instance of SQLite, I would really, strongly recommend you do this in SQL.

1 Comment

Will SQLite (or other SQL DBMS) really help with infix search? Does it have special kind of index for that? As far as I know, standard SQL indexes are not designed to do fast infix (contains) searches.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.