[Enhancement] Create a “Plain Language Score” #1
Description
Category
new license or license feature/enhancement
Feature
Develop a plain language score to reliably measure writing “plainness”
Use Case
We could use it to educate and highlight differences between licenses, to reward plainness “in the wild,” and to hold ourselves accountable with CI/CD scanning
Benefit
Make the world a little bit easier to understand.
Alternatives
There are many measures of readability, such as Flesch-Kincaid, Gunning Fog, and SMOG, but I don’t feel they wholly capture the important parts of clear and accessible writing.
Impact
Drive us to write more plainly, highlight efforts (or non-efforts) at plainness, and maybe even create a standard measure of plain writing.
Resources
There’s a decent body of research on readability.
Additional Information
Enhancement Proposal: Plain Language Score
We need a way to measure how easy licenses are to read. Let's call it the "Plain Language Score". Here's what I'm thinking:
What's the idea?
We want to create a fair way to measure the readability of all licenses, including ours. This score would help us show how much clearer our licenses are compared to the originals. It could also show the benefits of plain writing and help us reward and recognize when people do it well.
What might the score look like?
Here are some ideas for what the score could measure:
- Use of legal jargon
- Sentence complexity
- Organization (use of headers, lists/bullets, and paragraph density)
- Ratio of abstract to concrete words
- A traditional readability score (like Flesch-Kincaid)
These are just ideas! We can change any part of this - what we measure, how much each part counts, or how we measure it. The main goal is to have a clear and fair way to score license readability. I also think there is an opportunity to modernize readability measures with NLP techniques.
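As an illustration of how a single component might be computed, here's a minimal sketch of a jargon-density measure in Python. The term list is a tiny placeholder and the per-100-words scaling is arbitrary; a real implementation would need a curated legal lexicon and smarter matching (stemming, multi-word terms).

```python
import re

# Tiny illustrative list -- a real measure needs a curated legal lexicon.
LEGAL_JARGON = {
    "shall", "herein", "thereof", "utilize", "licensee",
    "notwithstanding", "heretofore", "aforementioned",
}

def jargon_density(text: str) -> float:
    """Jargon terms per 100 words (0 for empty text)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in LEGAL_JARGON)
    return 100 * hits / len(words)

jargon_density(
    "The Licensee shall not utilize the Software in any manner "
    "that may impair the integrity or performance of the system."
)  # → 15.0 (3 jargon terms in 20 words)
```

Even this crude version already separates the two example sentences below: the legalese sentence scores 15.0, the plain rewrite 0.0.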
How would we build it?
We'd make this a separate project from Plain License, but affiliated with it. It would have:
- Its own repository
- Version numbers for both the code and the scoring system
- Clear documentation on how it works
How would we manage changes?
Once we're out of the testing phase, we should think about how we approve changes to the score. We want it to be reliable, so we can't just change it whenever we want. Maybe we could have a group of experts review big changes, or a proposal review process?
What's next?
If you like this idea, we could:
- Start a new repository for this project
- Begin working on a basic version of the score
- Test it on different licenses to see how it works
- Ask for feedback from our community and legal experts
Example: How it might work
Let's look at a quick example of how this score might work. We'll use a 1-20 scale, where lower is better (like grade levels).
Here's a snippet from a made-up license:
Original: "The Licensee shall not utilize the Software in any manner that may impair the integrity or performance of the system."
Our version: "You may not use the work in a way that could damage or slow down the service."
Let's score them:
- Legal Jargon (30%):
  - Original: lots of jargon (15/20)
  - Ours: no jargon (2/20)
- Sentence Variety (25%):
  - Original: one complex sentence (14/20)
  - Ours: one simple sentence (5/20)
- Document Organization (25%):
  - Both: N/A for this short example
- Concrete Language (10%):
  - Original: mostly abstract (16/20)
  - Ours: mostly concrete (4/20)
- Flesch-Kincaid (10%):
  - Original: grade 16
  - Ours: grade 6
Putting it all together (the Organization component drops out, so only 75% of the weight applies):
Original: (15 * 0.3) + (14 * 0.25) + (16 * 0.1) + (16 * 0.1) = 11.2
Our version: (2 * 0.3) + (5 * 0.25) + (4 * 0.1) + (6 * 0.1) = 2.85
Final scores:
- Original: 11 (rounded, about 11th grade)
- Our version: 3 (rounded, about 3rd grade)
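To make the weighting concrete, here's a minimal Python sketch of the composite score. The component names and weights mirror the example above but are illustrative, not a settled design. One open question it makes explicit: when a component is N/A, we could renormalize the remaining weights (as below) rather than just summing them, so short samples aren't penalized for missing structure -- note that this produces different totals than the raw sums above.

```python
# Hypothetical sketch: component names, weights, and the N/A policy
# are illustrative, not a settled design.
WEIGHTS = {
    "legal_jargon": 0.30,
    "sentence_variety": 0.25,
    "organization": 0.25,
    "concrete_language": 0.10,
    "flesch_kincaid": 0.10,
}

def plain_language_score(components):
    """Weighted average on the 1-20 scale (lower is better).

    Components scored None (not applicable) drop out, and the
    remaining weights are renormalized.
    """
    applicable = {k: v for k, v in components.items() if v is not None}
    total_weight = sum(WEIGHTS[k] for k in applicable)
    weighted = sum(WEIGHTS[k] * v for k, v in applicable.items())
    return weighted / total_weight

original = plain_language_score({
    "legal_jargon": 15, "sentence_variety": 14, "organization": None,
    "concrete_language": 16, "flesch_kincaid": 16,
})
ours = plain_language_score({
    "legal_jargon": 2, "sentence_variety": 5, "organization": None,
    "concrete_language": 4, "flesch_kincaid": 6,
})
print(round(original, 2), round(ours, 2))  # → 14.93 3.8
```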
[my scoring system here clearly needs work]
Again, this is just an example - we can adjust how we calculate this as we develop the system. Some other things we could potentially include in our measure:
- Ratio of active to passive voice
- Frequency of conditional statements (e.g., if-then). I found these were the most difficult to translate to plain language.
- Visual accessibility (as a component of organization). This could be a lot of things; to name a few: frequency of headers, median length of text blocks, frequency of bullets.
- Consistent word choice: how well the text avoids using multiple synonyms for the same thing.
- Clarity of definitions. If a license includes a definitions section, I think there should be a higher penalty when the definitions suffer from the same inaccessible writing.
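Some of these additional measures could start as simple heuristics. For instance, here's a deliberately crude passive-voice flagger: a form of "to be" followed by a word ending in -ed/-en. It will miss irregular participles (e.g., "was paid") and produce false positives; real detection would need part-of-speech tagging, but a sketch like this is enough for early experiments.

```python
import re

# Crude heuristic, not real grammar analysis: a "to be" verb followed
# by a word ending in -ed/-en is flagged as a likely passive.
PASSIVE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+\w+(?:ed|en)\b",
    re.IGNORECASE,
)

def passive_count(text: str) -> int:
    """Count likely passive constructions in the text."""
    return len(PASSIVE.findall(text))

passive_count("The agreement was signed by both parties.")  # → 1
passive_count("Both parties signed the agreement.")         # → 0
```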
Let's discuss and refine this idea together!
We could also use it in our CI/CD process to enforce a target score, or identify parts of the site that we need to improve for clarity.
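The CI/CD idea could be a thin wrapper once the scoring library exists. A sketch, assuming a hypothetical `score_fn(text) -> float` from the not-yet-written library and an illustrative threshold:

```python
# Hypothetical CI gate. `score_fn` stands in for the scoring library,
# which doesn't exist yet; TARGET is an illustrative threshold.
TARGET = 8.0  # lower is plainer on the 1-20 scale

def check(name: str, text: str, score_fn) -> int:
    """Return 0 (pass) or 1 (fail), CLI-exit-code style."""
    score = score_fn(text)
    if score > TARGET:
        print(f"FAIL {name}: score {score:.1f} exceeds target {TARGET}")
        return 1
    print(f"OK {name}: score {score:.1f}")
    return 0
```

A CI job would run this over each page of the site and fail the build on any nonzero result, which both enforces the target and points at the pages that need clarity work.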