
I need to build a Quran app, and I want to read out a verse when the user touches it. The problem I'm facing is that some verses span one and a half lines (the verse highlighted in red) while others fit in a quarter of a line (the verse highlighted in green). So adding each verse to its own TextView or some other view won't work, it seems.

I want to detect verses like the red ones in the second image. I have audio files for the verses, so there is no need for text-to-speech conversion.

Lorem Ipsum
Aswin Anand
  • 1
    Is the page given as image data or rendered Unicode text? –  Apr 12 '12 at 06:22
  • Text to Speech.. But then, please show us your research. –  Apr 12 '12 at 06:22
  • I have tried adding each verse's image to an ImageView dynamically, but the problem is that an ImageView won't span one and a half lines like that. Some verses need one and a half lines; I hope you get what I mean. I need suggestions to overcome this. –  Apr 12 '12 at 06:25
  • 1
    Use two lines and just have some overhead? –  Apr 12 '12 at 06:25
  • @Ghost I have separate audio files, so no conversion is needed. I just want to know which verse is selected, and to lay out the verses in the view as in the screenshot of that app. –  Apr 12 '12 at 06:26
  • @Lucas I want each verse to start right after the previous one, not on a separate line. –  Apr 12 '12 at 06:29
  • I don't know how to read the language this is written in, so when do verses start/end? Maybe I don't get what you are talking about. –  Apr 12 '12 at 06:30
  • @Lucas I can't read it either :D. Did you see the screenshot? The part highlighted in green is one verse. This is Arabic, so it runs from right to left. The verse number is given in that box. Some verses may extend to one and a half lines, and I want to start displaying the next verse where the previous one ended. How can I do that? I also want to know when a verse is selected. –  Apr 12 '12 at 06:36
  • Storing the X-Y coordinates for each of the 6239 verses, just to know which verse is selected, is quite a heavy task. –  Apr 12 '12 at 06:38
  • 1
    By one and a half line do you mean it takes up the width of the screen, then half of the next line down? –  Apr 12 '12 at 06:38
  • @Lucas Yes, that's what I meant. http://freepicninja.com/img.php?img=img/806824aa.jpg Please see the image link I have given; I have highlighted a verse in red to show what I mean by one and a half lines :) –  Apr 12 '12 at 06:48
  • That's actually a novel problem, I'll see if I can try it. I make no promises though. –  Apr 12 '12 at 07:01
  • @Thilo Sorry, I missed your comment. We have the data both as Unicode text and as images. We also have audio, so no text-to-speech conversion is needed. My problem is displaying each verse as in that image and knowing when it is selected. –  Apr 12 '12 at 07:02
  • Display your image, like the one you posted, in an ImageView and define android.graphics.Region objects from preprocessed coordinates that build an android.graphics.Path. Then set an OnTouchListener on your ImageView that checks which Region was touched. The touched region then indicates which sound bit to play. You can also use the region and path to overlay the verse currently being played with a transparent color. –  Apr 12 '12 at 07:21
  • There are 6239 verses, so is there any better solution than storing the coordinates of each verse? –  Apr 12 '12 at 07:31
  • Assume there are no more than 3 Verses per line. Then there are 12 points to store per line, which equals 12 * 2 * 2 bytes = 48 bytes per line (points stored in shorts). How many lines are there per image? Say 20: 960 bytes per image. How big is your image in a reasonable quality? Say 40kb. I guess it's worth this overhead, since you only need the regions and paths for the current image. –  Apr 12 '12 at 09:02

1 Answer


This can be solved fairly straightforwardly with simple template matching. I don't know exactly how you have it set up, so I'll just describe the algorithm generally and use illustrations.

  • Observe that the verse numbers have a distinctive border that can easily be used to detect the start and end of a verse. So create a binarized template for that pattern and store it. Something like this:

    enter image description here

  • Since the number of lines on a screen is known in advance (you're formatting the page) and each verse has a constant height, you can easily infer (algorithmically) where the Y coordinates for the centerlines of the verses should be on the screen. This demonstrates the idea:

    enter image description here

  • When the user touches a verse, get the X-Y coordinates and snap the Y coordinate to the nearest verse center.

  • Then, starting from the X coordinate, perform simple template matching (cross-correlation) along that row. The first match (peak in the cross-correlation) in the forward direction (to the left) is the end point of the verse. If there is no forward match on the line, move down one line and repeat. Likewise, the first match in the reverse direction (to the right) marks the start point of the verse. If there are no matches in the reverse direction, move up one line (which you can do, because you know the Y coordinate of its centerline) and repeat: the first match from the left end will be the start point of the verse.

    Here's a short illustration of the idea. The yellow box is where the user touches the verse. You then do the cross-correlation with your template and the blue circles will be the match.

    enter image description here

    I also use template matching in this answer, if you're interested in seeing it in action.

  • Once you've determined the start point for the verse, then use an Arabic text recognizer to infer the verse number inside that border and play the corresponding audio file.
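The snapping and row-scan steps above can be sketched roughly as follows. This is a minimal illustration, not the answer's actual implementation: it assumes the page and the marker template are already binarized into lists of 0/1 rows, that all lines have equal height, and it uses a plain pixel-agreement score as a stand-in for true cross-correlation. All function names and the `threshold` parameter are hypothetical.

```python
def centerlines(top, line_height, n_lines):
    """Y coordinate of the centre of each text line (assumes equal line heights)."""
    return [top + i * line_height + line_height // 2 for i in range(n_lines)]


def snap_to_line(y, centers):
    """Index of the line whose centre is closest to the touch Y coordinate."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - y))


def match_score(image, template, x, y):
    """Fraction of template pixels that equal the image patch with top-left (x, y)."""
    th, tw = len(template), len(template[0])
    hits = sum(image[y + r][x + c] == template[r][c]
               for r in range(th) for c in range(tw))
    return hits / (th * tw)


def find_markers(image, template, cy, threshold=0.95):
    """Left-edge X positions on the row centred at cy where the template matches.

    Assumes the template fits vertically inside the image around cy.
    """
    th, tw = len(template), len(template[0])
    y0 = cy - th // 2
    xs, x = [], 0
    while x + tw <= len(image[0]):
        if match_score(image, template, x, y0) >= threshold:
            xs.append(x)
            x += tw  # skip past this match
        else:
            x += 1
    return xs


def verse_end_marker(image, template, centers, touch_x, touch_y):
    """(line index, marker X) of the first marker in reading order after the touch.

    Arabic runs right to left, so "forward" means towards smaller X on the
    touched line, then the right end of each following line.
    """
    start = snap_to_line(touch_y, centers)
    for line in range(start, len(centers)):
        xs = find_markers(image, template, centers[line])
        if line == start:
            xs = [x for x in xs if x <= touch_x]  # only markers left of the touch
        if xs:
            return line, max(xs)  # nearest marker when reading right to left
    return None
```

On a real page you would probably restrict the scan to a band around the centerline and use a proper normalized cross-correlation, since rendered glyphs rarely match a template pixel for pixel.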


Simpler solution:

A simpler solution, if you don't want to go through all this, is to store the X-Y coordinates of the verse starting points (keep it simple and use the center points). Once you get the coordinates of the user input, you can again snap it to the centerline and then walk backwards to see where the verse starts. This might have the advantage of being faster.

I didn't put this forward as the first solution because you seemed to reject a similar idea in the comments. In the end, it depends on your constraints: would you rather do computational work (template matching, which, by the way, also requires you to store the template) or use memory (storing coordinates)?

If I were you, I'd probably go with this one, but the image processing solution can be fun to try.
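A minimal sketch of the stored-coordinates lookup, under some assumptions of my own: each page is a list of lines, each line holds the X positions of the verse starts on it (sorted ascending) together with the matching verse numbers, and "walking backwards" in right-to-left text means moving to the right on the touched line and then to the left end of the lines above. The `page` layout and the name `verse_at` are hypothetical.

```python
from bisect import bisect_left


def verse_at(page, centers, touch_x, touch_y):
    """Return the verse number under a touch, or None if no verse start precedes it.

    page[i] is (xs, verses): the X positions of verse starts on line i in
    ascending order, and the verse number that begins at each position.
    """
    # Snap the touch to the nearest line centre.
    line = min(range(len(centers)), key=lambda i: abs(centers[i] - touch_y))

    # On the touched line, walk backwards (to the right): the verse under the
    # touch starts at the nearest stored start with x >= touch_x.
    xs, verses = page[line]
    i = bisect_left(xs, touch_x)
    if i < len(xs):
        return verses[i]

    # Otherwise the verse began on an earlier line: take the last start in
    # reading order there, which is the leftmost one.
    for xs, verses in reversed(page[:line]):
        if xs:
            return verses[0]
    return None


# Example: verse 1 starts at x=100 and verse 2 at x=30 on line 0, verse 2
# runs through line 1, and verse 3 starts at x=50 on line 2.
page = [([30, 100], [2, 1]), ([], []), ([50], [3])]
centers = [10, 30, 50]
print(verse_at(page, centers, 60, 30))
```

Only the current page's table needs to be in memory, so the lookup cost is a binary search per touch.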

Lorem Ipsum
  • yoda or anyone else, can you explain this a bit better, please? What I actually want to know, and am not getting, is how to read the numbers inside the blue circles. Thanks a lot! –  Apr 24 '12 at 23:42
  • @xmenus Once you have locked in on the circle (which you do by pattern matching the distinctive border), you'll have to use an Arabic text/number recognition library for identifying the contents inside. I don't know Arabic and so can't recommend anything. You could try asking the OP... – Lorem Ipsum Apr 25 '12 at 00:17
  • @xmenus If you need help with Arabic you can ask me, but from what you are asking if you simply wanted to detect an Arabic number you could presumably do the same template matching for them as you are doing with the circles. – Spacey Apr 25 '12 at 01:42