Update namespace.cc: re-implement the member function NormalizePath #778

Open
wants to merge 5 commits into
base: master


ToWorld
Contributor

@ToWorld ToWorld commented Jan 9, 2017

Update namespace.cc: re-implement the member function NormalizePath. There are some tests below.
image
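Algorithm description:
`Pseudocode:
std::string NameSpace::NormalizePath(const std::string& path) {
  // Holds the normalized path
  std::string ret;
  const size_t len = path.size();
  // Boundary case: an empty or relative path starts at the root
  if (path.empty() || path[0] != '/') {
    ret = "/";
  }
  // Main loop of the algorithm.
  // Each iteration looks at two consecutive characters.
  // Why two, rather than one, three, or more?
  // It follows from the problem being solved: this function normalizes
  // a (user-supplied) path, for example:
  // "home" --> "/home"
  // "" --> "/"
  // "///" --> "/"
  // "home//baidu///" --> "/home/baidu"
  // etc.
  // The main idea is to handle two consecutive characters at a time,
  // with index pointing at the second character of the pair,
  // so index is initialized to 1.
  size_t index = 1;
  while (index < len) {
    // If the character at index is not '/', both characters of the
    // pair can be stored as they are.
    if (path[index] != '/') {
      ret.push_back(path[index-1]);
      ret.push_back(path[index]);
      // Advance index by 2
      index += 2;
    } else {
      // The character at index is '/', so we need to know what lies on
      // its left and on its right. Handle the left neighbor first:
      // store it if it is not '/', otherwise skip it.
      // The right neighbor is not known yet, so leave the current
      // character unprocessed for now and advance index by 1.
      if (path[index-1] != '/') {
        ret.push_back(path[index-1]);
      }
      index++;
    }
  }
  // Because index can advance by 2, or because it starts at 1, the main loop
  // may leave the last character of the source string unprocessed, e.g.:
  // "a", "/", "/aa", "///"
  // Handle that case last.
  // Only one check is needed: store the last character if it is not '/', otherwise skip it.
  // This still misses one test case, "/", which the check that follows covers.
  if (index == len && path[index-1] != '/') {
    ret.push_back(path[index-1]);
  }
  if (ret.empty()) {
    ret = "/";
  }
  return ret;
}

`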
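The pseudocode above can be turned into a compilable sketch. This is only an illustration, not the code in the patch: `NormalizePathSketch` and `RunNormalizePathExamples` are hypothetical names, and the function is written as a free function rather than a `NameSpace` member so that it stands alone. The example inputs are the ones listed in the comments above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical free-function version of the two-character scan
// (assumption: same behavior as the pseudocode in this PR description).
std::string NormalizePathSketch(const std::string& path) {
    std::string ret;
    const std::size_t len = path.size();
    // Empty or relative input: start from the root.
    if (path.empty() || path[0] != '/') {
        ret = "/";
    }
    std::size_t index = 1;  // points at the second character of the pair
    while (index < len) {
        if (path[index] != '/') {
            // Second character is not a slash: keep the whole pair.
            ret.push_back(path[index - 1]);
            ret.push_back(path[index]);
            index += 2;
        } else {
            // Second character is a slash: keep only a non-slash left
            // neighbor and decide about the slash on the next iteration.
            if (path[index - 1] != '/') {
                ret.push_back(path[index - 1]);
            }
            ++index;
        }
    }
    // The loop can stop with exactly one character left unhandled.
    if (index == len && path[index - 1] != '/') {
        ret.push_back(path[index - 1]);
    }
    if (ret.empty()) {
        ret = "/";
    }
    return ret;
}

// Checks the examples listed in the PR description.
void RunNormalizePathExamples() {
    assert(NormalizePathSketch("home") == "/home");
    assert(NormalizePathSketch("") == "/");
    assert(NormalizePathSketch("///") == "/");
    assert(NormalizePathSketch("home//baidu///") == "/home/baidu");
}
```

Declaring `index` outside the loop keeps it visible to the final check, and `std::size_t` avoids a narrowing comparison against `path.size()`.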
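Here is why this algorithm is more efficient than processing one character at a time:

  1. A path entered by a user will usually look like "home/", "//", or "/baidu/"; input such as "//////////" is unlikely.
    In that extreme case the algorithm is about as efficient as processing one character at a time. For normal input,
    the higher the share of non-"/" characters, the bigger the gain from handling two characters per iteration. The gain is capped, though:
    the loop can never need fewer than half the iterations of a one-character scan, because each iteration handles at most two characters.
    Preliminary tests show a clear improvement.
    Suggestions for a better approach are welcome.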
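The iteration-count argument can be checked with a small sketch. `CountTwoCharIterations` is a hypothetical helper, not part of the patch; it only replays how the index of the two-character scan advances and counts the loop iterations. It measures loop trips, not wall-clock time.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts main-loop iterations of the two-character
// scan described above (index starts at 1, advances by 2 on a non-slash
// character and by 1 on a slash).
std::size_t CountTwoCharIterations(const std::string& path) {
    std::size_t iterations = 0;
    std::size_t index = 1;
    while (index < path.size()) {
        index += (path[index] != '/') ? 2 : 1;
        ++iterations;
    }
    return iterations;
}
```

For eight non-slash characters such as "aaaaaaaa" the loop runs 4 times, which is the best case of half the length. For eight slashes it runs 7 times, close to the 8 iterations a one-character scan would need.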

Update namespace.cc: re-implement the member function NormalizePath, which increased efficiency by at least 10%
slash = true;
} else {
slash = false;
if (path[i-1] == '/') {
Collaborator


Code style.

Contributor Author


Ok, I get it.

@ToWorld ToWorld closed this Jan 10, 2017
@ToWorld ToWorld reopened this Jan 10, 2017
@ToWorld ToWorld closed this Jan 10, 2017
@ToWorld ToWorld deleted the patch-6 branch February 9, 2017 07:40
@ToWorld ToWorld restored the patch-6 branch February 22, 2017 08:08
@ToWorld ToWorld reopened this Feb 22, 2017

2 participants