Skip to content

Conversation

@marler8997
Copy link
Contributor

@marler8997 marler8997 commented Jan 25, 2018

This PR modifies dmd to verify that the name of a module always matches the name that was used to import it. Depending on the situation, if the names do not match then this PR will either provide a deprecation message or assert an error.

  1. DEPRECATED: import a module with a qualified name but the module being imported has no module declaration
// main.d
import foo.bar;
// foo/bar.d
// note the missing module declaration
 Deprecated: module 'bar' from file foo/bar.d should be imported with 'import bar;'
  1. DEPRECATED: import a module with a qualified name that partially matches the name of the module being imported
// main.d
import foo.bar;
import foo.baz.buz;
// foo/bar.d
module bar;
// foo/baz/buz.d
module baz.buz;
 Deprecated: module 'bar' from file foo/bar.d should be imported with 'import bar;'
 Deprecated: module 'baz.buz' from file foo/baz/buz.d should be imported with 'import baz.buz;'

NOTE: for this rule to apply, the ends of the imported name and the module name itself must match.

  1. ERROR: importing a module that matches the filename but does not match the module name
// main.d
import foo;
// foo.d
module bar;
 Error: module 'bar' from file foo.d must be imported with 'import bar;'
  1. NOT APPLICABLE/STILL WORKS: import a module with the correct module name but does not match the filename
// main.d
import bar;
// foo.d
module bar;

NOTE: this also addresses a bug found by @timotheecour

#7777 (comment)

NOTE: The DMD test suite itself was using some of this functionality in a fair amount of tests, i.e.

  • test/runnable/imports/link7745b.d
  • test/fail_compilation/imports/a11919.d

@dlang-bot
Copy link
Contributor

dlang-bot commented Jan 25, 2018

Thanks for your pull request and interest in making D better, @marler8997! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • My PR follows the DStyle
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Auto-close Bugzilla Severity Description
15086 critical import doesn't verify module declaration

// COMPILED_IMPORTS: imports/nomodname.d
// REQUIRED_ARGS: -Ifail_compilation
// PERMUTE_ARGS:
import imports.nomodname; No newline at end of file
Copy link
Contributor

@timotheecour timotheecour Jan 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no newline ad EOF (likewise for other files in this PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a bad thing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it's a convention (lots of editor show an annoying marker when we forget one

@@ -0,0 +1 @@
// the point of this module is that it has no module name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// intentionally no module declaration

@marler8997 marler8997 force-pushed the dashIBugFix branch 2 times, most recently from d029af5 to b1300ce Compare January 25, 2018 21:18
foreach (i; 0 .. left.dim)
if ((*left)[i] != (*right)[i])
return false;
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bug? with right.dim > left.dim

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shoot, woops

@marler8997 marler8997 mentioned this pull request Jan 25, 2018
@marler8997 marler8997 force-pushed the dashIBugFix branch 10 times, most recently from 0bf7c98 to 29b87ab Compare January 25, 2018 23:39
@marler8997 marler8997 changed the title Bug fix for corner case using -i Bug fix: dmd allows importing modules with wrong name Jan 25, 2018
@marler8997 marler8997 force-pushed the dashIBugFix branch 2 times, most recently from c038071 to 2c62ea4 Compare January 25, 2018 23:56

// verify the module name matches the imported module name
if (!packages.equals((m.md is null) ? null : m.md.packages))
m.error(loc, "from file %s must be imported with 'import %s;'", m.srcfile, m.toPrettyChars());
Copy link
Contributor

@JinShil JinShil Jan 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, I think this will introduce a regression, but probably one D needs.

  • This needs to be specified in the specification, if it's not already.
  • This probably needs to go through the deprecation process. That stinks, but it probably needs to be done.
  • I think this will require a changelog entry

That's just my opinion/perception. I'd be happy to fast-track this as I think it's the right thing to do to enforce reliable usage of the language, but that's not my call.

@JinShil
Copy link
Contributor

JinShil commented Jan 26, 2018

Also, I think we require bugzilla entries for all bug fixes.

@timotheecour
Copy link
Contributor

@JinShil found this old bug which might be related: https://issues.dlang.org/show_bug.cgi?id=15086

@JinShil
Copy link
Contributor

JinShil commented Jan 26, 2018

found this old bug

Ok, if this PR addresses that bug, prepend "Fix Isssue xxxx" to this PR's title and to the commit message so @dlang-bot will pick it up.

@jacob-carlborg
Copy link
Contributor

Is this anything related to declaring a module with a different name than the file? Because that was used as a feature in DMD, before the Objective-C integration was refactored. Previously there were two files, objc.d and objc_stubs.d, both used the same module declaration: module ddmd.objc. This allowed to swap the implementation in the makefiles based on the platform without having to change any imports. See 9fe35b1 for reference. This is not used anymore and only one objc.d file exists.

@marler8997
Copy link
Contributor Author

No activity for 11 days. I haven't tried to push anyone else to look at this since I'd rather my other PR's receive attention since they need to be merged quickly before the next release, but I have plenty of time if anyone has time to review/approve/request changes.

Copy link
Contributor

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marler8997 this LGTM. There's still a small nit from @timotheecour, but apart from that I'm just waiting here for @andralex or @WalterBright to have a brief look as seen on the testsuite it might trigger a couple of deprecation messages. However, it's important to realize that (1) it's not breaking any code yet (we never need to follow through with the deprecation and (2) this is in accordance with the D spec.

Copy link
Member

@andralex andralex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this PR forces all modules imported via a path to use a module declaration. Is this correct? If so, it's too much.

@@ -1,3 +1,4 @@
module imports.a313templatemixin1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? We should continue to work if the module declaration is not present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's one way to go. You could also add the imports directory to the "import path list", i.e.

dmd -Icompilable/imports ...

then no module declaration is required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marler8997 can't we just penalize people who enter wrong module information? It seems this PR does that but also adds another restriction, which is more difficult to justify.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not sure exactly what you're trying to say. There's 2 cases listed in the original PR message that have been deprecated, and 1 that has been declared to be an error. Do those use cases categorize the cases you are thinking of? If so, which ones do you think should not be deprecated/made errors?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marler8997 yah let me discuss exactly on the cases you present in the changelog. Thanks for the neat organization.

@marler8997
Copy link
Contributor Author

It seems this PR forces all modules imported via a path to use a module declaration. Is this correct? If so, it's too much.

I'm not sure exactly what you mean. There's a couple different cases. If you import a module using an "unqualified name", then no module declaration is required. However, if you import a module with a "qualified name", and that module has no module declaration, then this PR will print a deprecation message. i.e.

// main.d
import foo.bar.baz.buzz;

// foo/bar/baz/buzz.d
// no module declaration (deprecation message)

@andralex
Copy link
Member

@marler8997 if I understand correctly you impose that no module could be imported through two different paths. Isn't that a bit excessive?

@marler8997
Copy link
Contributor Author

@marler8997 if I understand correctly you impose that no module could be imported through two different paths. Isn't that a bit excessive?

2 different paths is ok, it's 2 different module names that this is preventing.

@marler8997
Copy link
Contributor Author

marler8997 commented Feb 12, 2018

// main.d
import foo.bar.baz.buz;
// other.d
import baz.buz;
// foo/bar/baz/buz.d

Are you saying you think this should be OK?


Some instances of module name/import name mismatches were deprecated while others are now considered errors.

Case 1: DEPRECATED: import a module name with a qualified name but the module being imported has no module declaration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This effectively imposes that modules inside directories (frequent situation) must have a module declaration. Moreover, the module declaration makes the module contents dependent on the directory in which it resides, thus closing a circular dependency between the file content and its position in the file system. Whenever the module is moved around, it needs to also be touched.

These does not seem desirable qualities. On the contrary, it was appreciated that a given module can be accessed via different packages (within e.g. different projects), via directory aliases, etc. It also seems makes it more difficult to define and use hierarchical packages.

What advantages does introducing this restriction have?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This effectively imposes that modules inside directories (frequent situation) must have a module declaration.

Actually it just means that they must be imported with a "non-qualified" name.

---
The above code will now print:

$(CONSOLE Deprecated: module bar from file foo/bar.d should be imported with 'import bar;'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively: replace module declaration with module foo.bar;

We need to show advantages of this as well. For example show how the current rules lead to a breach in modularity.

My current mental model goes like this: main.d uses a dotted import to guide path selection, and that also confers a local name for the imported module. The fact that the module gave itself a non-dotted name is irrelevant to main.d. I see nothing wrong in the fact that the module name from the perspective of code inside foo/bar.d is different from the module name from the perspective of code inside main.d. In fact we have a language construct that messes exactly with that:

import takezat = foo.bar;

We'd need a good argument that the above is a bad thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. That's not how I originally understood the module system to work, but your description matches the implementation that I found.

One disadvantage of the current mechanism is that this causes the compiler to load the same module multiple times. Each unique path used to import the module will result in another instance of the module being instantiated. I think there's more to say but I'll leave it at that for now.

The above code will now print:

$(CONSOLE Deprecated: module bar from file foo/bar.d should be imported with 'import bar;'
Deprecated: module baz.buz from file foo/baz/buz.d should be imported with 'import baz.buz;' )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A similar line of reasoning would apply here. We need to build a case on why that's bad.

Case 3: ERROR: import a module that matches the filename but does not match the module name.
---
// main.d

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to add some content to main.d?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, should be import foo;

---
The above code will now fail and print:

$(CONSOLE Error: module bar from file foo.d must be imported with 'import bar;')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how the file would ever be found. So the compiler is supposed to compile main.d. It sees import bar;. How does it know it's supposed to look for a file called foo.d?

Copy link
Contributor Author

@marler8997 marler8997 Feb 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how the file would ever be found.

The file is give on the command line.

Note that module names not matching file names is a technique already used (see #7778 (comment)) , however, this PR does enforce that the import name matches the actual module name.

@andralex
Copy link
Member

Are you saying you think this should be OK?

@marler8997 what would be the directory structure and compiler flags? Would those be in the same project? (If different projects, I can definitely say that should be okay.)

Overall: this diff renders a use of module names used in a bunch of our own tests as an antipattern. I think this is a serious sign we need to scrutinize the matter appropriately.

@marler8997
Copy link
Contributor Author

Overall: this diff renders a use of module names used in a bunch of our own tests as an antipattern. I think this is a serious sign we need to scrutinize the matter appropriately.

Definitely. I welcome the scrutiny. It looks like we'll have to explore this more, maybe I can come up with some use cases that demonstrate the advantages of these restrictions.

@marler8997
Copy link
Contributor Author

marler8997 commented Feb 12, 2018

@marler8997 what would be the directory structure and compiler flags? Would those be in the same project? (If different projects, I can definitely say that should be okay.)

I've created the small example project I mentioned and depending on how I compile it/if I add/remove module names, etc. I end up with all kinds of weird errors. Some cases I get a linker error, some cases I get a compilation error, and some cases I don't get any error, but each case is very non-intuitive and requires intimiate knowledge of how the compiler works to understand what's going on.

I argue that the ability to import a module using different names severely breaks the module system. Normally if people play nice and don't do weird things then it's ok, but this PR is preventing those weird abuses which in my opinion have no utility and only serve to add confusion to the module system.

@marler8997
Copy link
Contributor Author

marler8997 commented Feb 12, 2018

Let's look at one of the weird things that can happen with the current state of the "unverified" module system. Take the following 2 files:

// main.d
import foo.bar;
void main() { barfunc(); }
// foo/bar.d
void barfunc() { }

If you compile "foo/bar.d" seperately like this:

dmd -c foo/bar.d

Then you can successfully compile the full program via

dmd main.d bar.o

However, if you compile it fully from source you will get a compilation error:

dmd main.d foo/bar.d
Error: module bar from foo/bar.d must be imported with `import bar;`

This error occurs becase the module foo/bar.d is being loaded twice with 2 different names. The first time it is loaded because it is given on the command line, and since it has no module declaration, the compiler gives it the name bar. But then the module gets imported with the name foo.bar, however, the compiler cannot change the name of the module and all existing reference to it so an error must be asserted. In the first case where the module is precompiled, the module only gets loaded once from the import so it will be given the proper name of foo.bar and the compilation succeeds.

This difference in behavior couples the module system with how the files are compiled. Compiling some combinations of files will succeed while others will not.

This PR breaks this coupling by forbidding this behavior altogether. No matter what combination you compile the files in, you will always get the following error:

Error: module bar from foo/bar.d must be imported with `import bar;` or given a name with `module foo.bar;`

This example demonstrates the core problem with supporting the first 2 cases that this PR aims to deprecate. However, note that this example is just one manifestation of the core problem. The same core problem could also cause different files to be imported from the same import statement, which can cause nonsensical compilation/linker errors which are very difficult to root cause.

@marler8997
Copy link
Contributor Author

P.S. Many of the tests in the DMD test suite will only compile if you compile the imports separately from the test file itself. If you compile them all together in one invocation, you will get an import error. This demonstrates a real example where not verifying import/module names puts a requirement on what order/how you compile your modules together.

@WalterBright
Copy link
Member

The fact that this PR requires 80-90 files in the test suite to be edited casts serious doubt on the viability of this change w.r.t. breaking existing code.

@marler8997
Copy link
Contributor Author

Editing all those files is not required. I'll modify the PR to remove those changes to demonstrate.

@wilzbach
Copy link
Contributor

Better submit a new PR, s.t. you don't loose these changes ;-)

@marler8997
Copy link
Contributor Author

Since it looks like it's going to take some work to reach consensus on cases 1 and 2, I'm going to split this PR. I'm going to push a new PR that only addresses case 3 (which addresses issue 15086). Once that is merged I can create a new PR that addresses case 1 and 2. This limits the scope of each PR making them easier to discuss and prevents conflation of the individual points for each case.

@marler8997 marler8997 changed the title Fix issue 15086: import doesn't verify module declaration Deprecate import/module name mismatch (ALL CASES) Feb 12, 2018
@marler8997
Copy link
Contributor Author

Closing this one in favor of (#7878) and potentially a new PR in the future for cases 1 and 2.

@marler8997 marler8997 closed this Feb 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.