
Commit 7520277

Add documentation and examples for group parameter in Table.hist (#641)
* docstring:added description of the group parameter to Table.hist
* tutorial: added example of using group parameter in hist
* Add 'group' example to hist section in datascience-reference
1 parent 3edff0b commit 7520277

File tree

3 files changed: +53 -15 lines changed

datascience/tables.py

Lines changed: 2 additions & 1 deletion
@@ -5285,7 +5285,8 @@ def hist(self, *columns, overlay=True, bins=None, bin_column=None, unit=None, co
 grouped by the values in this column, and a separate histogram is
 generated for each group. The histograms are overlaid or plotted
 separately depending on the overlay argument. If None, no such
-grouping is done.
+grouping is done. Note: `group` cannot be used together with `bin_column` or when plotting
+multiple columns. An error will be raised in these cases.
 
 side_by_side (bool): Whether histogram bins should be plotted side by
 side (instead of directly overlaid). Makes sense only when
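
The restriction documented in this hunk can be illustrated with a small standalone check. This is a sketch only: `check_hist_args` is a hypothetical helper, not a function in `datascience`, and the use of `ValueError` is an assumption, since the docstring says only that "an error will be raised".

```python
def check_hist_args(columns, group=None, bin_column=None):
    """Hypothetical helper mirroring the documented limits on `group`.

    Not part of the datascience library; the exception type is assumed.
    """
    # `group` and `bin_column` are documented as mutually exclusive.
    if group is not None and bin_column is not None:
        raise ValueError("group cannot be used together with bin_column")
    # `group` is documented as unsupported when plotting multiple columns.
    if group is not None and len(columns) > 1:
        raise ValueError("group does not support multiple histogram columns")
    return True


# One value column plus a grouping column is the supported combination.
check_hist_args(["Score"], group="Group")

# Both unsupported combinations are rejected.
for kwargs in (
    {"columns": ["Score"], "group": "Group", "bin_column": "Bins"},
    {"columns": ["Score", "Age"], "group": "Group"},
):
    try:
        check_hist_args(**kwargs)
    except ValueError as err:
        print("rejected:", err)
```

The column names `Score`, `Age`, `Group`, and `Bins` are placeholders chosen for the illustration.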

docs/reference-nb/datascience-reference.ipynb

Lines changed: 38 additions & 14 deletions
@@ -207,9 +207,7 @@
 {
 "cell_type": "code",
 "execution_count": 32,
-"metadata": {
-"scrolled": false
-},
+"metadata": {},
 "outputs": [
 {
 "data": {
@@ -2194,7 +2192,9 @@
 {
 "cell_type": "code",
 "execution_count": 80,
-"metadata": {},
+"metadata": {
+"scrolled": true
+},
 "outputs": [
 {
 "data": {
@@ -2211,6 +2211,34 @@
 "actors.hist(\"Gross\")"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### Using `group` with `Table.hist`\n",
+"\n",
+"You can also group the histogram by a categorical column using `group=`:\n",
+"\n",
+"The number of columns must be one, and you can't use `bin_column` with `group`."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"students = Table().with_columns(\n",
+" 'Score', np.concatenate([\n",
+" np.random.normal(75, 10, 500), # Group X: higher average score\n",
+" np.random.normal(65, 10, 500) # Group Y: lower average score\n",
+" ]),\n",
+" 'Group', ['X'] * 500 + ['Y'] * 500 # Assign 500 Xs and 500 Ys\n",
+")\n",
+"\n",
+"students.hist('Score', group='Group') # Plot histogram grouped by 'Group'\n"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -4234,9 +4262,7 @@
 {
 "cell_type": "code",
 "execution_count": 100,
-"metadata": {
-"scrolled": false
-},
+"metadata": {},
 "outputs": [
 {
 "data": {
@@ -4516,9 +4542,7 @@
 {
 "cell_type": "code",
 "execution_count": 112,
-"metadata": {
-"scrolled": false
-},
+"metadata": {},
 "outputs": [
 {
 "data": {
@@ -6469,7 +6493,7 @@
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "Python 3",
+"display_name": "Python 3 (ipykernel)",
 "language": "python",
 "name": "python3"
 },
@@ -6483,9 +6507,9 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.8.1"
+"version": "3.13.3"
 }
 },
 "nbformat": 4,
-"nbformat_minor": 2
-}
+"nbformat_minor": 4
+}

docs/tutorial.rst

Lines changed: 13 additions & 0 deletions
@@ -258,6 +258,19 @@ Draw histograms with :meth:`~datascience.tables.Table.hist`:
 @savefig hist_overlay.png width=4in
 normal_data.hist(bins = range(-5, 10), overlay = True)
 
+Draw grouped histograms with the ``group`` argument:
+
+.. ipython:: python
+
+grouped = Table().with_columns(
+'value', np.random.normal(size=100),
+'group', np.random.choice(['A', 'B'], size=100)
+)
+
+@savefig hist_group.png width=4in
+grouped.hist('value', group='group')
+Note: group cannot be used together with bin_column, and does not support multiple histogram columns.
+
 If we treat the ``normal_data`` table as a set of x-y points, we can
 :meth:`~datascience.tables.Table.plot` and
 :meth:`~datascience.tables.Table.scatter`:
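
To make concrete what a grouped histogram computes, the sketch below splits one value column by a label column and counts each group over shared bin edges. It uses only the standard library and is an illustration under assumptions, not the `datascience` implementation: `grouped_counts` is a hypothetical name, the data are made up, and it reports raw counts, whereas `Table.hist` may normalize or choose bins differently.

```python
from collections import defaultdict


def grouped_counts(values, labels, bin_edges):
    """Count values per bin separately for each label, using shared bin edges.

    Bins are half-open [lo, hi), except that the last bin also includes
    its right edge. Values outside the edges are ignored.
    """
    n_bins = len(bin_edges) - 1
    counts = defaultdict(lambda: [0] * n_bins)
    for value, label in zip(values, labels):
        for i in range(n_bins):
            lo, hi = bin_edges[i], bin_edges[i + 1]
            is_last = i == n_bins - 1
            if lo <= value < hi or (is_last and value == hi):
                counts[label][i] += 1
                break
    return dict(counts)


# Made-up scores for two groups, binned into [50, 60), [60, 70), [70, 80), [80, 90].
scores = [62, 71, 78, 85, 58, 66, 69, 74]
groups = ["X", "X", "X", "X", "Y", "Y", "Y", "Y"]
result = grouped_counts(scores, groups, bin_edges=[50, 60, 70, 80, 90])
print(result)  # {'X': [0, 1, 2, 1], 'Y': [1, 2, 1, 0]}
```

Sharing the bin edges across groups is what makes the per-group histograms directly comparable when overlaid.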
