## Overview

The Holy Quran consists of **114** chapters (Surah) and a total of **6236** verses (Ayah).
The Quran was revealed orally over a period of 23 years and was written down for the first time in the era
of the Khulafaur Rasyidin. In most printed editions, the Quran has 604 pages, which are further
organized into parts, each called a Juz. There are 30 Juz in total.

For the results presented in this document, the Quran text is based on the Uthmani version published by
the Tanzil project (http://tanzil.net/). Based on this text, the Quran is composed of **77430**
words and **325666** letters (note: the un-numbered Basmallah at the beginning of 112 chapters
is not counted). For comparison, based on data published by corpus.quran.com
(Kais Dukes, University of Leeds), the figures are **77429** words and **623638** characters
(letters and diacritics/harakah counted together). The only difference in the word count is at QS 37:130, for the Arabic words إِلْ يَاسِينَ
(trans: Prophet Elijah / Ilyas a.s.). The Quran corpus counts it as one word, while Tanzil's version of
the Uthmani text writes it as two words.
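
The one-word difference comes down to tokenization. The following is a minimal sketch, assuming that words are whitespace-separated tokens and that letters are all characters other than whitespace and combining marks. The exact counting rules behind the figures above are not stated here, so both helper functions are illustrative only.

```python
import unicodedata

# The QS 37:130 phrase as two whitespace-separated tokens, written with
# explicit code points so that the diacritics are unambiguous.
IL_YASIN = (
    "\u0625\u0650\u0644\u0652 "
    "\u064a\u064e\u0627\u0633\u0650\u064a\u0646\u064e"
)


def count_words(text):
    """Count whitespace-separated tokens (assumed definition of a word)."""
    return len(text.split())


def count_letters(text):
    """Count characters that are neither whitespace nor combining marks."""
    return sum(
        1
        for ch in text
        if not ch.isspace() and unicodedata.category(ch) != "Mn"
    )


print(count_words(IL_YASIN))                   # 2 (two-token spelling)
print(count_words(IL_YASIN.replace(" ", "")))  # 1 (joined spelling)
print(count_letters(IL_YASIN))                 # 7
```

Under these assumptions the spelling with a space yields two words and the joined spelling yields one, while the letter count is unaffected.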

This document mainly presents numbers and figures with minimal narration, since it is meant to be
a quick reference supporting further research into aspects of The Noble Quran. The data used in
RDS-Q #4 is at **verse level** (length **6236**), i.e. data related to
words and letters are pre-processed for each verse, while data for Juz and pages are aggregated
accordingly.

## All Verses - C,V,C+V

The verse data, of length 6236, is ordered by sequential chapter number (C) and, within each chapter, by sequential verse number (V). The Figure below shows this arrangement. Each change in C resets V to its base value of 1, forming a sawtooth-like graph. The length of a chapter is the distance along the x-axis between two consecutive points at this base value.
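
The ordering described above can be sketched in a few lines. The example uses only the verse counts of the first three chapters (7, 286 and 200). The variable names are illustrative and are not taken from the project's data files.

```python
# Verse counts of the first three chapters; the full data set would list
# all 114 chapters.
chapter_lengths = [7, 286, 200]

# Build the verse sequence: one (C, V) pair per verse, in reading order.
sequence = [
    (c, v)
    for c, length in enumerate(chapter_lengths, start=1)
    for v in range(1, length + 1)
]

# V resets to 1 at the start of each chapter: the bases of the sawtooth.
reset_positions = [i for i, (_, v) in enumerate(sequence) if v == 1]

# Chapter length = distance between consecutive resets (the last chapter
# runs to the end of the sequence).
boundaries = reset_positions + [len(sequence)]
recovered = [b - a for a, b in zip(boundaries, boundaries[1:])]

print(reset_positions)  # [0, 7, 293]
print(recovered)        # [7, 286, 200]
```

The distances between resets reproduce the chapter lengths, which is the reading of the sawtooth graph described in the text.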

The following Figures give histogram plots for C, V and C+V. For each Figure, the number of bins is varied to show how the parameters behave under different groupings. While all parameters show some degree of possible pattern, the most interesting ones appear in the histograms of V and C+V.

The last Figure in this section depicts the density function of each parameter.

The following Table gives some statistical values for C,V and C+V parameters.

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Chapter No. (C) | 114 | 51.0 | 26.0 | 11.0 | 1 | 33.52 | 26.46 | 6236 | 209029 |
Verse No. (V) | 286 | 75.0 | 38.0 | 16.0 | 1 | 53.51 | 50.46 | 6236 | 333667 |
C+V | 288 | 107.0 | 83.0 | 58.0 | 2 | 87.03 | 43.98 | 6236 | 542696 |
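
The columns of such a table can be reproduced with the Python standard library. This is a sketch under two assumptions that the document does not confirm: the quartiles use linear interpolation between data points, and `std` is the sample standard deviation. The input is the word count (`t_w_nb`) of the first eight rows of the Sample Data table at the end of this document.

```python
import statistics


def summarize(values):
    """Return the statistics used in the tables for one parameter."""
    q25, q50, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "max": max(values),
        "75%": q75,
        "50%": q50,
        "25%": q25,
        "min": min(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "count": len(values),
        "sum": sum(values),
    }


# t_w_nb of the first eight Sample Data rows.
words = [2, 19, 15, 11, 11, 12, 10, 28]
stats = summarize(words)
print(stats["25%"], stats["50%"], stats["75%"])  # 10.75 11.5 16.0
print(stats["mean"], stats["sum"])               # 13.5 108
```

Applied to all 6236 verses instead of eight sample rows, the same function would yield one table row per parameter.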

## All Verses - W,L

The following five Figures focus on the number of words (W) and letters (L).

The first Figure depicts a line plot of W over the verse sequence. The second Figure gives histograms of W for different numbers of bins.

The third Figure depicts a line plot of L over the verse sequence. The fourth Figure gives histograms of L for different numbers of bins. The last Figure below shows the density plots of both W and L.

The following Table gives some statistical values for W and L.

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Num. of Words | 128 | 16.0 | 10.0 | 6.0 | 1 | 12.42 | 9.42 | 6236 | 77430 |
Num. of Letters | 547 | 68.0 | 43.0 | 23.0 | 2 | 52.22 | 39.25 | 6236 | 325666 |

## Split [1-3118] & [3119-6236]

For some investigations we might want to look at a segment of the data. The following Figure shows the density functions of C, V and C+V for the two segments [1-3118] and [3119-6236], i.e. the first and second halves of the verse sequence.
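
The midpoint of the sequence does not coincide with a chapter boundary. The sketch below maps a global verse index to its chapter and verse using the verse counts of chapters 1 to 26. The function name `locate` and the constants are illustrative, not part of the project's code.

```python
from bisect import bisect_left
from itertools import accumulate

# Verse counts of chapters 1-26, enough to cover global index 3119.
CHAPTER_LENGTHS = [
    7, 286, 200, 176, 120, 165, 206, 75, 129, 109, 123, 111, 43,
    52, 99, 128, 111, 110, 98, 135, 112, 78, 118, 64, 77, 227,
]
# CUMULATIVE[i] is the global index of the last verse of chapter i + 1.
CUMULATIVE = list(accumulate(CHAPTER_LENGTHS))


def locate(global_index):
    """Map a 1-based global verse index to (chapter, verse)."""
    if not 1 <= global_index <= CUMULATIVE[-1]:
        raise ValueError("index outside the chapters listed above")
    chapter = bisect_left(CUMULATIVE, global_index)  # 0-based chapter
    previous_end = CUMULATIVE[chapter - 1] if chapter else 0
    return chapter + 1, global_index - previous_end


print(locate(3118))  # (26, 186): last verse of the first segment
print(locate(3119))  # (26, 187): first verse of the second segment
```

Chapter 26 therefore contributes to both segments, which matches the chapter range of each segment in the tables below (maximum 26 in the first, minimum 26 in the second).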

The following Table gives some statistical values for C, V and C+V for the segment [1-3118].

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Chapter No. (C) | 26 | 19.0 | 11.0 | 5.0 | 1 | 12.44 | 7.8 | 3118 | 38779 |
Verse No. (V) | 286 | 103.0 | 64.0 | 31.0 | 1 | 74.05 | 55.08 | 3118 | 230879 |
C+V | 288 | 116.0 | 78.0 | 45.0 | 2 | 86.48 | 53.19 | 3118 | 269658 |

The following Table gives some statistical values for C, V and C+V for the segment [3119-6236].

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Chapter No. (C) | 114 | 72.0 | 51.0 | 37.0 | 26 | 54.6 | 21.23 | 3118 | 170250 |
Verse No. (V) | 227 | 44.0 | 23.0 | 10.0 | 1 | 32.97 | 34.88 | 3118 | 102788 |
C+V | 253 | 103.0 | 85.0 | 67.0 | 28 | 87.57 | 32.26 | 3118 | 273038 |

The following Table gives some statistical values for W and L for the segment [1-3118].

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Num. of Words | 128 | 19.0 | 13.0 | 9.0 | 1 | 15.46 | 10.02 | 3118 | 48210 |
Num. of Letters | 547 | 82.0 | 55.0 | 37.0 | 2 | 65.0 | 41.79 | 3118 | 202681 |

The following Table gives some statistical values for W and L for the segment [3119-6236].

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Num. of Words | 78 | 13.0 | 7.0 | 4.0 | 1 | 9.37 | 7.64 | 3118 | 29220 |
Num. of Letters | 330 | 53.0 | 29.0 | 17.0 | 2 | 39.44 | 31.75 | 3118 | 122985 |

## Odd/Even - Chapters & Verses

For symmetry investigations we might want to look at segments of the data selected by a certain criterion, in this case the odd/even C+V criterion reported in RDS-Q #1 and RDS-Q #2: a verse belongs to the odd segment if C+V is odd and to the even segment otherwise. The following Figure shows the density functions of C, V and C+V for the odd and even segments.

As the Figure shows, the odd and even segments share (almost) the same density curves. This is due to the sequential nature of both the chapter number C and the verse number V (within the same chapter). In terms of shape, all curves are similar to those of the unsplit data, except that the amplitude is lower.
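
The odd/even assignment is a parity test on C+V. A minimal sketch follows, checked against four rows of the Sample Data table (columns `ch_no`, `ve_no`, `cav`, `cavoe`). The function name `cav_parity` is illustrative.

```python
def cav_parity(ch_no, ve_no):
    """Return C+V and its parity label for one verse."""
    cav = ch_no + ve_no
    return cav, "odd" if cav % 2 else "even"


# (ch_no, ve_no) pairs taken from the Sample Data table.
samples = [(37, 3), (17, 62), (21, 2), (4, 52)]
for ch_no, ve_no in samples:
    print(ch_no, ve_no, *cav_parity(ch_no, ve_no))
# 37 3 40 even
# 17 62 79 odd
# 21 2 23 odd
# 4 52 56 even
```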

The following Table gives some statistical values for C, V and C+V for the **odd** segment.

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Chapter No. (C) | 114 | 51.0 | 26.0 | 11.0 | 1 | 33.52 | 26.43 | 3118 | 104516 |
Verse No. (V) | 285 | 75.0 | 38.0 | 16.0 | 1 | 53.54 | 50.48 | 3118 | 166926 |
C+V | 287 | 107.0 | 83.0 | 57.0 | 3 | 87.06 | 44.0 | 3118 | 271442 |

The following Table gives some statistical values for C, V and C+V for the **even** segment.

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Chapter No. (C) | 114 | 50.75 | 26.0 | 11.0 | 1 | 33.52 | 26.49 | 3118 | 104513 |
Verse No. (V) | 286 | 75.0 | 38.0 | 16.0 | 1 | 53.48 | 50.45 | 3118 | 166741 |
C+V | 288 | 106.0 | 82.0 | 58.0 | 2 | 87.0 | 43.97 | 3118 | 271254 |

## Odd/Even - Words & Letters

The following Figure shows the density functions of W and L for the odd and even C+V segments.

It is quite interesting that the density curves of both W and L are almost identical for the odd and even segments. Unlike the sequential parameters C and V discussed previously, W and L are derived entirely from the verse text. The values in the tables below are also interesting and to some extent support what these curves show.

The number of words for the odd/even group is **38716** / **38714**.
The number of letters for the odd/even group is **162821** / **162845**.
Both the sum of W and the sum of L are split almost exactly in half.
Note that with the data from corpus.quran.com, the number of words W for the odd group is **38715**,
while that of the even group is **38714**.
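
How close the split is to one half can be checked with simple arithmetic on the Tanzil-based totals quoted above. The dictionary layout is only for illustration.

```python
# (odd, even) sums as reported in the text.
totals = {"words": (38716, 38714), "letters": (162821, 162845)}

results = {}
for name, (odd, even) in totals.items():
    total = odd + even
    results[name] = (total, odd - even, round(odd / total, 5))
    print(name, *results[name])
# words 77430 2 0.50001
# letters 325666 -24 0.49996
```

The two groups add up to the overall totals of 77430 words and 325666 letters, and the odd share differs from one half only in the fifth decimal place.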

The following Table gives some statistical values for W and L for the **odd** segment.

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Num. of Words | 78 | 16.0 | 10.0 | 5.0 | 1 | 12.42 | 9.54 | 3118 | 38716 |
Num. of Letters | 344 | 68.75 | 42.0 | 23.0 | 2 | 52.22 | 39.92 | 3118 | 162821 |

The following Table gives some statistical values for W and L for the **even** segment.

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Num. of Words | 128 | 16.0 | 10.0 | 6.0 | 1 | 12.42 | 9.3 | 3118 | 38714 |
Num. of Letters | 547 | 68.0 | 43.0 | 24.0 | 2 | 52.23 | 38.58 | 3118 | 162845 |

## Pages - Chapters & Verses

As briefly mentioned in the overview, verse-level data can be aggregated by page or by Juz. The Figure below depicts the number of chapters and verses on each page. As expected, most pages contain verses from a single chapter only. Note that the sum of the number of chapters includes repetitions, since a chapter spanning several pages is counted once on each of them.

The following Table gives some statistical values for C and V for pages data.

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Num. of Chapters | 3 | 1.0 | 1.0 | 1.0 | 1 | 1.1 | 0.33 | 604 | 662 |
Num. of Verses | 42 | 11.0 | 8.0 | 7.0 | 1 | 10.32 | 6.18 | 604 | 6236 |
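
The page-level aggregation can be sketched as a group-by over verse rows. The rows below are illustrative and assume that verses 1:1-7 are on page 1, 2:1-5 on page 2 and 2:6-16 on page 3. The variable names are not taken from the project's code.

```python
from collections import defaultdict

# (ch_no, ve_no, page) rows for the first three pages (assumed layout).
rows = (
    [(1, v, 1) for v in range(1, 8)]
    + [(2, v, 2) for v in range(1, 6)]
    + [(2, v, 3) for v in range(6, 17)]
)

verses_per_page = defaultdict(int)
chapters_per_page = defaultdict(set)
for ch_no, _ve_no, page in rows:
    verses_per_page[page] += 1
    chapters_per_page[page].add(ch_no)

chapter_counts = {page: len(chs) for page, chs in chapters_per_page.items()}
print(dict(verses_per_page))         # {1: 7, 2: 5, 3: 11}
print(sum(chapter_counts.values()))  # 3, although only 2 chapters appear
```

Chapter 2 is counted once on page 2 and once on page 3. Repeated counting of this kind would explain why the sum of 662 in the table is far larger than the 114 chapters.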

## Pages - Words & Letters

The data can also be viewed at the level of words and letters. The following Figure gives the number of words and letters on each page.

The following Table gives some statistical values for W and L for pages data.

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Num. of Words | 161 | 137.0 | 129.0 | 121.0 | 29 | 128.2 | 14.92 | 604 | 77430 |
Num. of Letters | 693 | 573.0 | 543.0 | 514.0 | 139 | 539.18 | 58.83 | 604 | 325666 |

## Juz - Pages, Chapters & Verses

The following two Figures illustrate the number of pages (P), chapters (C) and verses (V) for each Juz.

The plots of P and C are given in the first Figure, while the number of V is in the second one. As might be expected, most Juz consist of 20 pages.

The following Table gives some statistical values for P, C and V for Juz data.

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Num. of Pages | 23 | 20.0 | 20.0 | 20.0 | 20 | 20.27 | 0.64 | 30 | 608 |
Num. of Chapters | 37 | 4.0 | 3.0 | 2.0 | 1 | 4.5 | 6.55 | 30 | 135 |
Num. of Verses | 564 | 220.75 | 170.5 | 143.5 | 110 | 207.87 | 107.57 | 30 | 6236 |

Note that a chapter can span Juz boundaries, and a single page can contain both the end of one Juz and the start of the next (namely pages 62, 121, 201 and 502), which is why the sums of chapters and pages exceed 114 and 604.
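
The page total in the table can be reconciled with the 604 printed pages by counting each shared page twice. This is a small arithmetic sketch, assuming each of the four pages listed is shared by exactly two Juz.

```python
TOTAL_PAGES = 604
# Pages holding the end of one Juz and the start of the next.
SHARED_PAGES = [62, 121, 201, 502]
NUM_JUZ = 30

# Each shared page is counted once for each of its two Juz.
pages_summed_over_juz = TOTAL_PAGES + len(SHARED_PAGES)
mean_pages_per_juz = round(pages_summed_over_juz / NUM_JUZ, 2)

print(pages_summed_over_juz)  # 608
print(mean_pages_per_juz)     # 20.27
```

Both values agree with the `sum` and `mean` columns of the Num. of Pages row.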

## Juz - Words & Letters

The following Figure gives the number of words and letters for each Juz.

The following Table gives some statistical values for W and L for Juz data.

Parameter | max | 75% | 50% | 25% | min | mean | std | count | sum |
---|---|---|---|---|---|---|---|---|---|
Num. of Words | 2774 | 2640.25 | 2596.5 | 2520.5 | 2308 | 2581.0 | 100.27 | 30 | 77430 |
Num. of Letters | 11497 | 11050.0 | 10900.0 | 10727.75 | 9704 | 10855.53 | 349.25 | 30 | 325666 |

## Resources

Resource | URL |
---|---|
Data | https://github.com/eueung/rds-q/tree/master/data |
PDF | https://github.com/eueung/rds-q/tree/master/PDF |
Project (All) | https://github.com/eueung/rds-q/ |
Web | https://quran.telematika.org/00004/quran-statistics-6236.html |
Web (All) | https://quran.telematika.org/ |

## Sample Data

ve_no_g | ch_no | ve_no | page | juz | t_w_nb | t_c_nb | cav | cavoe |
---|---|---|---|---|---|---|---|---|
3791 | 37 | 3 | 446 | 23 | 2 | 11 | 40 | even |
3381 | 29 | 41 | 401 | 20 | 19 | 88 | 70 | even |
2091 | 17 | 62 | 288 | 15 | 15 | 63 | 79 | odd |
2485 | 21 | 2 | 322 | 17 | 11 | 42 | 23 | odd |
545 | 4 | 52 | 87 | 5 | 11 | 43 | 56 | even |
2457 | 20 | 109 | 319 | 16 | 12 | 43 | 129 | odd |
3384 | 29 | 44 | 401 | 20 | 10 | 44 | 73 | odd |
5091 | 57 | 16 | 539 | 27 | 28 | 117 | 73 | odd |
4640 | 50 | 10 | 518 | 26 | 5 | 21 | 60 | even |
2180 | 18 | 40 | 298 | 15 | 15 | 62 | 58 | even |
5504 | 74 | 9 | 575 | 29 | 4 | 16 | 83 | odd |
2725 | 23 | 52 | 345 | 18 | 8 | 32 | 75 | odd |
839 | 6 | 50 | 133 | 7 | 28 | 103 | 56 | even |
1262 | 9 | 27 | 191 | 10 | 12 | 40 | 36 | even |
2465 | 20 | 117 | 320 | 16 | 12 | 48 | 137 | odd |
20 | 2 | 13 | 3 | 1 | 19 | 80 | 15 | odd |
3500 | 31 | 31 | 414 | 21 | 19 | 68 | 62 | even |
4593 | 48 | 10 | 512 | 26 | 25 | 104 | 58 | even |
5392 | 70 | 17 | 569 | 29 | 4 | 16 | 87 | odd |
1541 | 11 | 68 | 229 | 12 | 12 | 45 | 79 | odd |
829 | 6 | 40 | 132 | 7 | 15 | 61 | 46 | even |
4525 | 46 | 15 | 504 | 26 | 45 | 186 | 61 | odd |
358 | 3 | 65 | 58 | 3 | 15 | 67 | 68 | even |
4596 | 48 | 13 | 512 | 26 | 9 | 42 | 61 | odd |