How to run newer code on older Java?

If you ever had a situation when you were forced to stick with specific Java version due to dependant library version or (ekhem) licensing restriction, you can now breathe a sigh of relief: due to fantastic JVM community it is possible to avoid this issue! I will show you how to achieve this in simple, ready to use, project.

Read more!

All you need to know about String memory performance in Java in 2019

Did you know that, according to Java implementators, about 25% of memory consumed by large-scale applications are Strings? And what if I tell you that you can decrease this value with a single command?

Read more!

Genetic Ranker - genetic algorithms for search

Working as a search engineer myself I decided to develop a framework for finding optimal query weights for search engines like Elasticsearch or Solr. It is based on a machine learning branch called genetic programming, inspired by the process of natural selection. In this post I’ll describe it and briefly discuss how the good process of building the quality of search should look like. Let’s start!

Read more!

How to change Solr standard tokenizer rules?

Although Solr comes with standard tokenizer implementation, which is well prepared to tokenize most of the texts, there are cases when it is helpless. Imagine a document with many numbers, of which many are followed by percentage sign. In a certain contexts it is expected to distinguish queries that refer to those percentages & plain numbers. How to achieve that? We need a custom tokenizer.

Read more!

Extract entities from document with Solr Text Tagger

Algorithms for recognizing entities from text are ones of the most crucial aspects of text analysis. They lead to better understanding of the content, enable additional operations like filtering or grouping and - most importantly - allow to process data automatically. In the previous post I announced combination of text indexing & such extraction and in order to keep my promise I created a fork of Solr Text Tagger.

Read more!

How to add data to Solr document during indexing?

The process of indexing in Solr in an advanced topic covered by many publications. On the most basic level it can be described as putting data into previously prepared containers. But what if user wants to perform additional data processing depending on documents that already are in the index?

Read more!